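Chapter 9: High-Availability Architecture

9.1 Nginx Cluster Deployment

A high-availability (HA) architecture keeps a system running through hardware, software, and network failures. For Nginx, this usually means running several Nginx instances behind a failover or load-balancing mechanism, so that no single instance is a single point of failure.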

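9.1.1 High Availability with Keepalived

Keepalived is a high-availability solution based on VRRP (Virtual Router Redundancy Protocol). It moves an IP address from a failed node to a healthy one.

9.1.1.1 Architecture

Master/backup architecture

  • Master Nginx node: handles all requests
  • Backup Nginx node: stands by and takes over when the master fails
  • Keepalived: monitors Nginx and performs the IP failover

Virtual IP (VIP)

  • Clients reach the service through the virtual IP
  • The virtual IP lives on the master node and moves to the backup node automatically when the master fails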

9.1.1.2 Installation and Configuration

Environment

  • Master node: IP address 192.168.1.100
  • Backup node: IP address 192.168.1.101
  • Virtual IP: 192.168.1.200

Install Keepalived

# Ubuntu/Debian
sudo apt-get update
sudo apt-get install -y keepalived

# CentOS/RHEL
sudo yum install -y keepalived

Configure the master node

sudo nano /etc/keepalived/keepalived.conf

Add the following content:

! Configuration File for keepalived

global_defs {
   notification_email {
     admin@example.com
   }
   notification_email_from keepalived@example.com
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id NGINX_MASTER
}

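# Script that monitors the Nginx service
vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
    interval 2
    # On failure the priority drops by 20 (100 -> 80), below the backup's 90.
    # A drop of only 10 would tie at 90, and the backup would not take over.
    weight -20
}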

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    
    # Authentication (example password; use your own, identical on both nodes)
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    
    # Virtual IP
    virtual_ipaddress {
        192.168.1.200/24 dev eth0
    }
    
    # Script to track
    track_script {
        check_nginx
    }
    
    # Notification scripts
    notify_master "/etc/keepalived/notify.sh master"
    notify_backup "/etc/keepalived/notify.sh backup"
    notify_fault "/etc/keepalived/notify.sh fault"
}

Configure the backup node

sudo nano /etc/keepalived/keepalived.conf

Add the following content:

! Configuration File for keepalived

global_defs {
   notification_email {
     admin@example.com
   }
   notification_email_from keepalived@example.com
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id NGINX_BACKUP
}

# Script that monitors the Nginx service
vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
    interval 2
    # On failure this node's priority drops by 20 (90 -> 70)
    weight -20
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 90
    advert_int 1
    
    # Authentication (must match the master node)
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    
    # Virtual IP
    virtual_ipaddress {
        192.168.1.200/24 dev eth0
    }
    
    # Script to track
    track_script {
        check_nginx
    }
    
    # Notification scripts
    notify_master "/etc/keepalived/notify.sh master"
    notify_backup "/etc/keepalived/notify.sh backup"
    notify_fault "/etc/keepalived/notify.sh fault"
}
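
How the priority and weight values in the two configurations interact can be sketched with a little shell arithmetic. This is a simplified model, not Keepalived's implementation: the function names effective_priority and vip_holder are made up for illustration, and it assumes the usual VRRP rule that a backup takes over only when its priority is strictly higher than the priority the master advertises.

```shell
#!/bin/sh
# Simplified model of the VRRP election arithmetic (illustrative only).

# effective_priority BASE WEIGHT CHECK_OK
# With a negative WEIGHT, the base priority is reduced by |WEIGHT|
# while the tracked script is failing (CHECK_OK = 0).
effective_priority() {
    if [ "$3" -eq 1 ]; then
        echo "$1"
    else
        echo $(( $1 + $2 ))
    fi
}

# vip_holder MASTER_PRIORITY BACKUP_PRIORITY
# Assumption: the backup takes over only with a strictly higher priority.
vip_holder() {
    if [ "$2" -gt "$1" ]; then
        echo backup
    else
        echo master
    fi
}

# Nginx has failed on the master (CHECK_OK = 0); the backup has priority 90.
vip_holder "$(effective_priority 100 -10 0)" 90   # prints "master" (90 vs 90 is a tie)
vip_holder "$(effective_priority 100 -20 0)" 90   # prints "backup" (80 < 90)
```

With priorities 100 and 90, the check script's weight therefore has to be larger than 10 in magnitude before a failed Nginx can trigger a failover.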

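Create the Nginx monitoring script

sudo nano /etc/keepalived/check_nginx.sh

Add the following content:

#!/bin/bash

# Check for a process named exactly "nginx". A plain `grep nginx` over the
# process list would also match this script (check_nginx.sh) and always succeed.
if pgrep -x nginx > /dev/null; then
    exit 0
fi

# Try to restart Nginx (Keepalived normally runs this script as root, so no sudo)
systemctl restart nginx

# Wait 2 seconds, then check again
sleep 2

# The exit status of this last check is the script's result:
# non-zero tells Keepalived that Nginx is still down
pgrep -x nginx > /dev/null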

Create the notification script

sudo nano /etc/keepalived/notify.sh

Add the following content:

#!/bin/bash

# Notification script: reports each Keepalived state change by e-mail.
# Requires a working `mail` command on the node.

EMAIL="admin@example.com"
VIP="192.168.1.200"

case "$1" in
    master)
        SUBJECT="[Keepalived] $HOSTNAME became master for $VIP"
        MESSAGE="$HOSTNAME is now the master node for VIP $VIP"
        echo "$MESSAGE" | mail -s "$SUBJECT" "$EMAIL"
        ;;
    backup)
        SUBJECT="[Keepalived] $HOSTNAME became backup for $VIP"
        MESSAGE="$HOSTNAME is now the backup node for VIP $VIP"
        echo "$MESSAGE" | mail -s "$SUBJECT" "$EMAIL"
        ;;
    fault)
        SUBJECT="[Keepalived] $HOSTNAME is in fault state for $VIP"
        MESSAGE="$HOSTNAME is now in fault state for VIP $VIP"
        echo "$MESSAGE" | mail -s "$SUBJECT" "$EMAIL"
        ;;
    *)
        echo "Unknown state: $1"
        exit 1
        ;;
esac

Set script permissions

sudo chmod +x /etc/keepalived/check_nginx.sh
sudo chmod +x /etc/keepalived/notify.sh

Start the Keepalived service

# Run on both the master and the backup node
sudo systemctl start keepalived
sudo systemctl enable keepalived

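9.1.1.3 Verifying the High-Availability Setup

1. Check the virtual IP

# Run on the master node
ip addr show eth0

# The output should include the virtual IP
inet 192.168.1.200/24 scope global secondary eth0

2. Test failover

# On the master node, stop Nginx. check_nginx.sh tries to restart it, so the
# VIP only moves if the restart fails.
sudo systemctl stop nginx

# To force a failover, stop Keepalived on the master node as well
sudo systemctl stop keepalived

# Check whether the virtual IP has moved to the backup node
# Run on the backup node
ip addr show eth0

# The output should include the virtual IP
inet 192.168.1.200/24 scope global secondary eth0

# Check that the virtual IP is reachable
ping -c 3 192.168.1.200
curl -I http://192.168.1.200

3. Restore the master node

# On the master node, start Nginx and Keepalived again
sudo systemctl start nginx
sudo systemctl start keepalived

# Check whether the virtual IP has moved back to the master node
ip addr show eth0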
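
To see at a glance which node currently holds the VIP, the ip addr check can be wrapped in a small helper. A minimal sketch, assuming the VIP and interface used in this chapter; has_vip and vip_role are illustrative names, not part of Keepalived.

```shell
#!/bin/sh
# has_vip VIP
# Reads `ip addr` output on stdin and succeeds if VIP is listed as an address.
has_vip() {
    grep -qF "inet $1/"
}

# vip_role VIP
# Prints "active" if the text on stdin lists VIP, otherwise "standby".
vip_role() {
    if has_vip "$1"; then
        echo active
    else
        echo standby
    fi
}

# Usage on a node (assumes the interface is eth0):
#   ip -4 addr show eth0 | vip_role 192.168.1.200
```

Running the usage line on both nodes before and after a failover should show the roles swap.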

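9.1.2 DNS Load Balancing

DNS load balancing distributes requests by having the DNS server resolve one domain name to several IP addresses.

9.1.2.1 Configuration

Add multiple A records on the DNS server

example.com.    IN    A    192.168.1.100
example.com.    IN    A    192.168.1.101
example.com.    IN    A    192.168.1.102

Client request flow

  1. The client asks the DNS server for the IP addresses of example.com
  2. The DNS server returns several IP addresses
  3. The client picks one of them and sends its request there
  4. If that address is unreachable, the client may try another one; whether it does depends on the client implementation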
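
The client-side behavior in steps 3 and 4 can be sketched as follows. This is an illustration rather than how any particular client is implemented; first_reachable and http_probe are hypothetical helper names, and the addresses are the A records from the example above.

```shell
#!/bin/sh
# first_reachable PROBE ADDR...
# Calls PROBE with each address in order and prints the first address for
# which PROBE succeeds. Returns 1 if no address passes.
first_reachable() {
    probe=$1
    shift
    for addr in "$@"; do
        if "$probe" "$addr"; then
            echo "$addr"
            return 0
        fi
    done
    return 1
}

# Example probe: an HTTP request with a 2-second connect timeout.
http_probe() {
    curl -s -o /dev/null --connect-timeout 2 "http://$1/"
}

# Usage:
#   first_reachable http_probe 192.168.1.100 192.168.1.101 192.168.1.102
```

Passing the probe as a parameter keeps the retry logic separate from the way reachability is tested.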

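9.1.2.2 Advantages and Disadvantages

Advantages

  • Simple to set up; needs no extra hardware or software
  • Can distribute traffic across regions worldwide
  • Clients that retry other addresses get basic failover without extra infrastructure

Disadvantages

  • DNS caching can delay failover
  • Cannot adapt to actual server load
  • No session persistence

9.1.3 Hardware Load Balancers

A hardware load balancer is a network appliance dedicated to load balancing, such as F5 BIG-IP or Citrix NetScaler.

Advantages

  • High performance under high concurrency
  • Rich feature set, such as SSL offloading and DDoS protection
  • High reliability, usually with built-in redundancy

Disadvantages

  • High cost
  • Complex configuration
  • Limited scalability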

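9.2 Microservice API Gateway

An API gateway is a key component of a microservice architecture: it manages and routes API requests in one place. Nginx can act as an API gateway, providing request routing, authentication and authorization, rate limiting, and circuit breaking.

9.2.1 Location-Based Request Routing

Configuration example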

http {
    # Backend services
    upstream service1 {
        server 127.0.0.1:3000;
        server 127.0.0.1:3001;
    }
    
    upstream service2 {
        server 127.0.0.1:4000;
        server 127.0.0.1:4001;
    }
    
    upstream service3 {
        server 127.0.0.1:5000;
        server 127.0.0.1:5001;
    }
    
    server {
        listen 80;
        server_name api.example.com;
        
        # Versioned API routes
        location /api/v1/service1/ {
            proxy_pass http://service1/;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }
        
        location /api/v1/service2/ {
            proxy_pass http://service2/;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }
        
        location /api/v1/service3/ {
            proxy_pass http://service3/;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }
        
        # Health check endpoint
        location /health {
            return 200 "OK";
        }
    }
}

Routing rules

  • /api/v1/service1/ → service1 (ports 3000 and 3001)
  • /api/v1/service2/ → service2 (ports 4000 and 4001)
  • /api/v1/service3/ → service3 (ports 5000 and 5001)
  • /health → health check endpoint
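
These rules come down to prefix matching plus replacing the matched prefix with / (because each proxy_pass in the configuration ends in /). A rough shell model of that mapping, with route as a made-up helper name rather than anything Nginx provides:

```shell
#!/bin/sh
# route PATH
# Prints "<upstream> <URI sent to the upstream>" for the three service
# prefixes, "health-check" for /health, and "none" for anything else.
route() {
    case $1 in
        /health*)
            echo health-check
            return 0
            ;;
    esac
    for svc in service1 service2 service3; do
        prefix="/api/v1/$svc/"
        case $1 in
            "$prefix"*)
                # proxy_pass http://<svc>/ replaces the matched prefix with "/"
                rest=${1#"$prefix"}
                echo "$svc /$rest"
                return 0
                ;;
        esac
    done
    echo none
}

route /api/v1/service2/orders/42   # prints "service2 /orders/42"
```

The model shows why a backend behind /api/v1/service2/ never sees the /api/v1/service2 part of the path.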

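9.2.2 Authentication and Authorization

An API gateway can integrate authentication and authorization to protect backend services.

9.2.2.1 JWT Authentication

Configuration example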

http {
    # Backend service
    upstream backend {
        server 127.0.0.1:3000;
    }
    
    # Server with JWT verification
    server {
        listen 80;
        server_name api.example.com;
        
        # Public endpoints, no authentication required
        location /public/ {
            proxy_pass http://backend/;
        }
        
        # Endpoints that require JWT authentication
        location /api/ {
            # Verify the JWT through a subrequest
            auth_request /auth;
            
            # Pass the user ID returned by the verifier to the backend
            auth_request_set $user_id $upstream_http_x_user_id;
            proxy_set_header X-User-ID $user_id;
            
            proxy_pass http://backend/;
        }
        
        # JWT verification endpoint
        location = /auth {
            internal;
            proxy_pass http://backend/auth/verify;
            proxy_pass_request_body off;
            proxy_set_header Content-Length "";
            proxy_set_header X-Original-URI $request_uri;
        }
    }
}

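JWT verification flow

  1. The client sends a request with an Authorization header: Bearer <token>
  2. Nginx issues a subrequest to the internal /auth location, which proxies to the backend's /auth/verify endpoint for JWT verification
  3. On success, the backend returns 200 OK with the user information in a response header (X-User-ID)
  4. Nginx adds the user information to the request headers and forwards the request to the backend service
  5. On failure, the client receives 401 Unauthorized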
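
How Nginx treats the result of the /auth subrequest can be summarized in a small function. This sketch follows the behavior documented for the auth_request module (a 2xx response allows the request, 401 and 403 are passed on to the client, any other status is treated as an error); auth_outcome is an illustrative name.

```shell
#!/bin/sh
# auth_outcome SUBREQUEST_STATUS
# Prints what happens to the original request for a given status code
# returned by the verification subrequest.
auth_outcome() {
    case $1 in
        2??)
            echo "allow"
            ;;
        401|403)
            echo "deny $1"
            ;;
        *)
            # Any other status counts as an error; the client gets a 500
            echo "error 500"
            ;;
    esac
}

auth_outcome 200   # prints "allow"
auth_outcome 401   # prints "deny 401"
```

A verification service used with auth_request should therefore answer only with 2xx, 401, or 403.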

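9.2.2.2 OAuth 2.0 Integration

Nginx does not implement OAuth 2.0 itself, but it can delegate the check to an external OAuth 2.0 service through auth_request and redirect unauthenticated users to the login page.

Configuration example

http {
    # Backend service (example address)
    upstream backend {
        server 127.0.0.1:3000;
    }
    
    # OAuth 2.0 configuration
    server {
        listen 80;
        server_name api.example.com;
        
        location / {
            # OAuth 2.0 authentication
            auth_request /oauth2/auth;
            error_page 401 = @error401;
            
            proxy_pass http://backend;
        }
        
        # OAuth 2.0 authentication endpoint
        location = /oauth2/auth {
            internal;
            proxy_pass https://oauth2-server.com/auth;
            proxy_pass_request_body off;
            proxy_set_header Content-Length "";
            proxy_set_header X-Original-URI $request_uri;
        }
        
        # Error handling: send unauthenticated users to the login page
        location @error401 {
            return 302 https://oauth2-server.com/login?redirect_uri=$scheme://$host$request_uri;
        }
    }
}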

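9.2.3 Request/Response Transformation

An API gateway can transform requests and responses so that services expecting different formats can work together.

Configuration example

http {
    # Backend service (example address)
    upstream backend {
        server 127.0.0.1:3000;
    }
    
    # Request transformation example: add an API version header
    server {
        listen 80;
        server_name api.example.com;
        
        location /api/ {
            # Add an API version header
            proxy_set_header X-API-Version "1.0";
            
            # Rewrite the request path: strip the /api prefix
            rewrite ^/api/(.*)$ /$1 break;
            
            proxy_pass http://backend;
            
            # Response transformation example: modify response headers
            proxy_hide_header X-Powered-By;
            add_header X-Response-Time $upstream_response_time;
        }
    }
}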
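
The effect of the rewrite directive in this example can be tried outside Nginx with the same regular expression. A small sketch using sed; rewrite_api is an illustrative name, and like the Nginx rewrite it is meant to be given only the path, without a query string.

```shell
#!/bin/sh
# rewrite_api PATH
# Applies the pattern ^/api/(.*)$ and replaces it with /$1, as the
# rewrite directive does; paths that do not match are printed unchanged.
rewrite_api() {
    printf '%s\n' "$1" | sed -E 's#^/api/(.*)$#/\1#'
}

rewrite_api /api/users/42   # prints "/users/42"
```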

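9.3 Hands-On Project: Building a High-Availability Nginx Cluster

In this project we build a high-availability Nginx cluster made of two Nginx nodes with Keepalived failover.

9.3.1 Environment

Servers

  • Master node:
    • IP address: 192.168.1.100
    • Hostname: nginx-master
  • Backup node:
    • IP address: 192.168.1.101
    • Hostname: nginx-backup
  • Virtual IP: 192.168.1.200

Install the software

# Run on both nodes
sudo apt-get update
sudo apt-get install -y nginx keepalived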

9.3.2 Configuring Nginx

Use the same Nginx configuration on both nodes

sudo nano /etc/nginx/nginx.conf

Replace the file's contents with the following:

events {
    worker_connections 1024;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';
    
    access_log /var/log/nginx/access.log main;
    
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    
    server {
        listen 80;
        server_name example.com;
        
        root /var/www/html;
        index index.html;
        
        location / {
            try_files $uri $uri/ =404;
        }
        
        location /health {
            return 200 "OK";
        }
    }
}

Create the test pages

# Master node
echo "<h1>Nginx Master - 192.168.1.100</h1>" | sudo tee /var/www/html/index.html

# Backup node
echo "<h1>Nginx Backup - 192.168.1.101</h1>" | sudo tee /var/www/html/index.html

9.3.3 Configuring Keepalived

Configure the master node

sudo nano /etc/keepalived/keepalived.conf

Add the following content:

! Configuration File for keepalived

global_defs {
   router_id NGINX_MASTER
}

vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
    interval 2
    # On failure the priority drops by 20 (100 -> 80), below the backup's 90
    weight -20
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    
    virtual_ipaddress {
        192.168.1.200/24 dev eth0
    }
    
    track_script {
        check_nginx
    }
}

Configure the backup node

sudo nano /etc/keepalived/keepalived.conf

Add the following content:

! Configuration File for keepalived

global_defs {
   router_id NGINX_BACKUP
}

vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
    interval 2
    # On failure this node's priority drops by 20 (90 -> 70)
    weight -20
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 90
    advert_int 1
    
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    
    virtual_ipaddress {
        192.168.1.200/24 dev eth0
    }
    
    track_script {
        check_nginx
    }
}

Create the Nginx monitoring script

sudo nano /etc/keepalived/check_nginx.sh

Add the following content:

#!/bin/bash

# Match the process name exactly; `grep nginx` would also match check_nginx.sh
if pgrep -x nginx > /dev/null; then
    exit 0
fi

# Try one restart, then report the result of a second check
systemctl restart nginx
sleep 2
pgrep -x nginx > /dev/null

Set script permissions

sudo chmod +x /etc/keepalived/check_nginx.sh

9.3.4 Starting the Services

# Run on both nodes
sudo systemctl start nginx
sudo systemctl enable nginx

sudo systemctl start keepalived
sudo systemctl enable keepalived

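9.3.5 Verifying the High-Availability Cluster

1. Check service status

# Check Nginx status
sudo systemctl status nginx

# Check Keepalived status
sudo systemctl status keepalived

# Check the virtual IP
ip addr show eth0

2. Test failover

# Confirm the virtual IP is reachable
curl http://192.168.1.200

# On the master node, stop Nginx. check_nginx.sh tries to restart it, so the
# VIP only moves if the restart fails.
sudo systemctl stop nginx

# To force a failover, stop Keepalived on the master node as well
sudo systemctl stop keepalived

# Check whether the virtual IP has moved to the backup node
# Run on the backup node
ip addr show eth0

# Confirm the virtual IP is still reachable
curl http://192.168.1.200

# Restore the master node
sudo systemctl start nginx
sudo systemctl start keepalived

# Check whether the virtual IP has moved back to the master node
ip addr show eth0

3. Test load balancing

Add A records on the DNS server:

example.com.    IN    A    192.168.1.100
example.com.    IN    A    192.168.1.101

Test DNS load balancing:

# Send several requests and watch which node answers
for i in {1..10}; do curl http://example.com; echo; done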
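
Because each node serves a different test page, the loop's output can be counted to see how the requests were spread. A minimal sketch; tally_nodes is a hypothetical helper that matches the page titles created in 9.3.2. Expect uneven results, since resolvers and clients often cache or reorder the DNS answers.

```shell
#!/bin/sh
# tally_nodes
# Reads response bodies on stdin, one per line, and prints how many came
# from the master's test page and how many from the backup's.
tally_nodes() {
    master=0
    backup=0
    while IFS= read -r line; do
        case $line in
            *"Nginx Master"*) master=$((master + 1)) ;;
            *"Nginx Backup"*) backup=$((backup + 1)) ;;
        esac
    done
    echo "master=$master backup=$backup"
}

# Usage:
#   for i in 1 2 3 4 5 6 7 8 9 10; do curl -s http://example.com; echo; done | tally_nodes
```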

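9.3.6 Common Problems and Solutions

Problem 1: Keepalived does not start

Solution

  1. Check the configuration file syntax (the -t option requires a reasonably recent Keepalived release):

    sudo keepalived -t -f /etc/keepalived/keepalived.conf
  2. Check the logs:

    sudo journalctl -u keepalived
  3. Make sure virtual_router_id is the same on the master and backup nodes

Problem 2: The virtual IP is unreachable

Solution

  1. Check the firewall settings:

    sudo ufw status
    sudo ufw allow 80/tcp
    sudo ufw allow 443/tcp
  2. Check that Nginx listens on all interfaces:

    listen 0.0.0.0:80;
  3. Check that the virtual IP is configured correctly:

    ip addr show eth0

Problem 3: Failover does not work

Solution

  1. Check that the monitoring script is executable:

    sudo chmod +x /etc/keepalived/check_nginx.sh
  2. Run the monitoring script by hand:

    sudo /etc/keepalived/check_nginx.sh
    echo $?
  3. Check that the firewall is not blocking VRRP. ufw's proto option may not accept vrrp as a protocol name, so one option is to allow traffic to the VRRP multicast address (224.0.0.18) instead:

    sudo ufw allow to 224.0.0.18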

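Chapter Summary

In this chapter we covered:

  1. Nginx cluster deployment

    • High availability with Keepalived
    • DNS load balancing
    • Hardware load balancers
  2. Microservice API gateway

    • Location-based request routing
    • Authentication and authorization
    • Request/response transformation
  3. Hands-on project

    • Building a high-availability Nginx cluster
    • Configuring Keepalived failover
    • Verifying high availability
    • Solving common problems

Exercises

  1. Set up a two-node high-availability Nginx cluster with Keepalived failover
  2. Configure DNS load balancing so that a domain name resolves to several Nginx nodes
  3. Build an Nginx-based API gateway with request routing and authentication
  4. Test failover and recovery
  5. Monitor the state of the high-availability cluster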

Next chapter: Chapter 10, Troubleshooting and Optimization
