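Episode 263: Infrastructure as Code

Learning Objectives

  • Understand the concept and benefits of Infrastructure as Code (IaC)
  • Master Terraform's basic syntax and advanced features
  • Become familiar with Ansible for configuration management and automated deployment
  • Learn cloud-native IaC tools such as CloudFormation
  • Be able to design and apply IaC best practices

Core Concepts

1. Overview of Infrastructure as Code

1.1 Core Principles of IaC

  • Declarative programming: describe the desired state rather than the steps to reach it
  • Idempotency: applying the same code repeatedly yields the same result
  • Version control: every infrastructure change is managed through code
  • Automated testing: infrastructure code is tested and validated
  • Continuous integration/continuous deployment: infrastructure changes go through the CI/CD pipeline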
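
Idempotency is the easiest of these principles to see in a few lines of shell. The sketch below is an illustration only and is not taken from any IaC tool: `ensure_line` is a hypothetical helper that appends a line to a file only when that line is absent, so a second call leaves the file unchanged.

```shell
# ensure_line LINE FILE: append LINE to FILE only if it is not already there.
# Hypothetical helper for illustration; real IaC tools track far richer state.
ensure_line() {
  local line="$1" file="$2"
  touch "$file"
  # -x matches whole lines, -F treats the pattern as a fixed string
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

demo_file="$(mktemp)"
ensure_line "net.ipv4.ip_forward=1" "$demo_file"
ensure_line "net.ipv4.ip_forward=1" "$demo_file"   # second run changes nothing
grep -c '' "$demo_file"                            # line count: prints 1
rm -f "$demo_file"
```

A naive `echo ... >> file` would add a duplicate line on every run, which is exactly the behaviour declarative tools are designed to avoid.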

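1.2 Benefits of IaC

| Benefit | Description | Practical value |
| --- | --- | --- |
| Consistency | Keeps configuration identical across all environments | Fewer bugs caused by environment differences |
| Repeatability | Quickly reproduces the same infrastructure | Faster setup of development and test environments |
| Traceability | Every change has a recorded history | Easier auditing and troubleshooting |
| Automation | Reduces manual operation errors | Higher efficiency and reliability |
| Self-documentation | The code is the documentation | Documentation stays in sync with reality |

1.3 Comparison of Mainstream IaC Tools

| Tool | Type | Strengths | Typical use |
| --- | --- | --- | --- |
| Terraform | Declarative | Multi-cloud support, state management, rich provider ecosystem | Cross-cloud infrastructure management |
| Ansible | Configuration management | Agentless, easy to learn, YAML configuration | Configuration management and application deployment |
| CloudFormation | Cloud-native | Native AWS support, deep integration | AWS infrastructure management |
| Pulumi | Programmatic | General-purpose programming languages, type safety | Scenarios that need complex logic |
| AWS CDK | Programmatic | TypeScript, Python, and other languages; deep AWS integration | AWS infrastructure development |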

2. Terraform Basics

2.1 Installing and Configuring Terraform

# Install Terraform (Debian/Ubuntu, from the HashiCorp apt repository)
wget -O- https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
sudo apt update && sudo apt install terraform

# Verify the installation
terraform version

# Configure AWS credentials (placeholder values)
export AWS_ACCESS_KEY_ID="your-access-key-id"
export AWS_SECRET_ACCESS_KEY="your-secret-access-key"
export AWS_DEFAULT_REGION="us-east-1"

# Or configure them with the AWS CLI
aws configure

2.2 Terraform Basic Syntax

# main.tf
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
  required_version = ">= 1.0"
}

provider "aws" {
  region = var.aws_region
}

variable "aws_region" {
  description = "AWS region"
  type        = string
  default     = "us-east-1"
}

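variable "instance_type" {
  description = "EC2 instance type"
  type        = string
  default     = "t2.micro"
}

# Referenced by the resource tags below; the default value is an illustrative assumption
variable "environment" {
  description = "Deployment environment name"
  type        = string
  default     = "dev"
}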

resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = {
    Name        = "main-vpc"
    Environment = var.environment
  }
}

resource "aws_subnet" "public" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.1.0/24"
  map_public_ip_on_launch = true
  availability_zone       = "${var.aws_region}a"

  tags = {
    Name = "public-subnet"
  }
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id

  tags = {
    Name = "main-igw"
  }
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }

  tags = {
    Name = "public-rt"
  }
}

resource "aws_route_table_association" "public" {
  subnet_id      = aws_subnet.public.id
  route_table_id = aws_route_table.public.id
}

resource "aws_security_group" "web" {
  name        = "web-sg"
  description = "Allow web traffic"
  vpc_id      = aws_vpc.main.id

  ingress {
    description = "HTTP from anywhere"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "HTTPS from anywhere"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # Demo only: restrict SSH to a trusted CIDR range in real deployments
  ingress {
    description = "SSH from anywhere"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "web-sg"
  }
}

resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = var.instance_type
  subnet_id     = aws_subnet.public.id
  vpc_security_group_ids = [aws_security_group.web.id]

  tags = {
    Name        = "web-server"
    Environment = var.environment
  }
}

data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"]

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}

output "vpc_id" {
  description = "ID of the VPC"
  value       = aws_vpc.main.id
}

output "instance_public_ip" {
  description = "Public IP of the EC2 instance"
  value       = aws_instance.web.public_ip
}

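2.3 Terraform Workflow

# Initialize the Terraform working directory
terraform init

# Format the code
terraform fmt

# Validate the configuration
terraform validate

# Preview the execution plan
terraform plan

# Apply the configuration
terraform apply

# Show the current state
terraform show

# Import an existing resource into state
terraform import aws_instance.web i-1234567890abcdef0

# Destroy all managed resources
terraform destroy

# List the resources tracked in state
terraform state list

# Refresh state to match real infrastructure
# (the standalone `terraform refresh` command is deprecated)
terraform apply -refresh-only

# Print an output value
terraform output instance_public_ip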
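
The workflow above is often wrapped in a script for CI. The following is a minimal sketch, assuming the documented behaviour of `terraform plan -detailed-exitcode` (exit code 0 for no changes, 1 for an error, 2 for pending changes). The function names `describe_plan_exit` and `check_plan` are hypothetical.

```shell
# Translate the exit code of `terraform plan -detailed-exitcode` into a label.
describe_plan_exit() {
  case "$1" in
    0) echo "no-changes" ;;
    2) echo "changes-pending" ;;
    *) echo "error" ;;
  esac
}

# Run a plan in the current directory and report the outcome.
# Call this only where terraform is installed and `terraform init` has been run.
check_plan() {
  local code=0
  terraform plan -detailed-exitcode -input=false -out=tfplan >/dev/null || code=$?
  describe_plan_exit "$code"
}
```

Running `check_plan` on a schedule is also one simple way to notice drift: a result of `changes-pending` when nobody has edited the code suggests the real infrastructure was changed outside Terraform.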

3. Advanced Terraform Features

3.1 Modular Design

# modules/vpc/main.tf
variable "vpc_cidr" {
  description = "CIDR block for VPC"
  type        = string
}

variable "public_subnet_cidrs" {
  description = "CIDR blocks for public subnets"
  type        = list(string)
}

variable "availability_zones" {
  description = "Availability zones"
  type        = list(string)
}

resource "aws_vpc" "main" {
  cidr_block           = var.vpc_cidr
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = {
    Name = "main-vpc"
  }
}

resource "aws_subnet" "public" {
  count                   = length(var.public_subnet_cidrs)
  vpc_id                  = aws_vpc.main.id
  cidr_block              = var.public_subnet_cidrs[count.index]
  map_public_ip_on_launch = true
  availability_zone       = var.availability_zones[count.index]

  tags = {
    Name = "public-subnet-${count.index + 1}"
  }
}

output "vpc_id" {
  value = aws_vpc.main.id
}

output "public_subnet_ids" {
  value = aws_subnet.public[*].id
}

# main.tf
module "vpc" {
  source = "./modules/vpc"

  vpc_cidr            = "10.0.0.0/16"
  public_subnet_cidrs = ["10.0.1.0/24", "10.0.2.0/24"]
  availability_zones  = ["us-east-1a", "us-east-1b"]
}

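3.2 Data Sources and Remote State

# Using data sources
data "aws_caller_identity" "current" {}

data "aws_region" "current" {}

data "aws_availability_zones" "available" {
  state = "available"
}

resource "aws_s3_bucket" "terraform_state" {
  bucket = "terraform-state-${data.aws_caller_identity.current.account_id}"
}

# With AWS provider v4 and later, versioning and encryption are configured
# through separate resources; the inline blocks are deprecated
resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

# Configure remote state
# Backend blocks cannot reference variables or data sources, so the bucket
# name must be a literal (replace ACCOUNT_ID with the real account ID)
terraform {
  backend "s3" {
    bucket         = "terraform-state-ACCOUNT_ID"
    key            = "prod/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}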
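
A backend block accepts only literal values, so an account-specific bucket name has to be supplied some other way. One option is partial backend configuration, where the value is passed to `terraform init` with `-backend-config`. The sketch below assumes the AWS CLI and Terraform are installed and credentials are configured; `state_bucket_name` and `init_backend` are hypothetical helper names.

```shell
# Build the state bucket name from an AWS account ID.
state_bucket_name() {
  printf 'terraform-state-%s\n' "$1"
}

# Initialise the S3 backend with a bucket name computed at run time.
init_backend() {
  local account_id
  account_id="$(aws sts get-caller-identity --query Account --output text)" || return 1
  terraform init -backend-config="bucket=$(state_bucket_name "$account_id")"
}
```

With this approach the `bucket` argument is usually left out of the backend block altogether, so the same configuration can be initialised against different accounts.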

3.3 Workspaces

# Create workspaces
terraform workspace new dev
terraform workspace new staging
terraform workspace new prod

# Switch to a workspace
terraform workspace select dev

# Show the current workspace
terraform workspace show

# List all workspaces
terraform workspace list

# Use the workspace name in configuration
resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = terraform.workspace == "prod" ? "t3.medium" : "t2.micro"
  
  tags = {
    Name        = "web-${terraform.workspace}"
    Environment = terraform.workspace
  }
}
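
Because the active workspace decides which environment a command touches, it can help to check it before applying. The guard below is a sketch under the assumption that Terraform is installed and the directory is initialised; `workspace_matches` and `safe_apply` are hypothetical names, and only `terraform workspace show` and `terraform apply` are real commands.

```shell
# Succeed only when the active workspace equals the expected one.
workspace_matches() {
  local expected="$1" actual="$2"
  [ "$expected" = "$actual" ]
}

# Refuse to apply when the wrong workspace is selected.
safe_apply() {
  local expected="$1" actual
  actual="$(terraform workspace show)" || return 1
  if ! workspace_matches "$expected" "$actual"; then
    echo "Refusing to apply: workspace is '$actual', expected '$expected'" >&2
    return 1
  fi
  terraform apply
}
```

For example, `safe_apply prod` would stop with an error if `dev` were still selected.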

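4. Ansible Basics

4.1 Installing and Configuring Ansible

# Install Ansible
sudo apt update
sudo apt install -y ansible

# Verify the installation
ansible --version

# Create the Ansible configuration file
cat > ansible.cfg << 'EOF'
[defaults]
inventory = ./hosts
# Host key checking is disabled for lab convenience; enable it in production
host_key_checking = False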
remote_user = ubuntu
private_key_file = ~/.ssh/my-key.pem
timeout = 30

[privilege_escalation]
become = True
become_method = sudo
become_user = root
EOF

# Create the inventory file
cat > hosts << 'EOF'
[webservers]
web1 ansible_host=192.168.1.10
web2 ansible_host=192.168.1.11

[databases]
db1 ansible_host=192.168.1.20

[all:vars]
ansible_python_interpreter=/usr/bin/python3
EOF
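
Inventories like the one above are often generated rather than typed by hand, for instance from addresses printed by `terraform output`. The sketch below shows only the formatting step; `render_inventory_group` is a hypothetical helper, and the IP addresses are the sample values from the inventory above.

```shell
# render_inventory_group GROUP PREFIX IP...
# Print an INI inventory group with hosts named PREFIX1, PREFIX2, ...
render_inventory_group() {
  local group="$1" prefix="$2" i=1 ip
  shift 2
  printf '[%s]\n' "$group"
  for ip in "$@"; do
    printf '%s%d ansible_host=%s\n' "$prefix" "$i" "$ip"
    i=$((i + 1))
  done
}

render_inventory_group webservers web 192.168.1.10 192.168.1.11
```

The call prints the same `[webservers]` section as the hand-written file, so its output could be redirected into `hosts`.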

4.2 Ansible Playbook Basics

# site.yml
---
- name: Configure web servers
  hosts: webservers
  become: yes

  tasks:
    - name: Update apt cache
      apt:
        update_cache: yes
        cache_valid_time: 3600

    - name: Install required packages
      apt:
        name:
          - nginx
          - python3-pip
          - git
        state: present

    - name: Create application directory
      file:
        path: /var/www/myapp
        state: directory
        owner: www-data
        group: www-data
        mode: '0755'

    - name: Deploy application
      git:
        repo: https://github.com/username/myapp.git
        dest: /var/www/myapp
        version: main
        force: yes

    - name: Install Python dependencies
      pip:
        requirements: /var/www/myapp/requirements.txt
        executable: pip3

    - name: Configure Nginx
      template:
        src: templates/nginx.conf.j2
        dest: /etc/nginx/sites-available/myapp
        owner: root
        group: root
        mode: '0644'
      notify: Restart Nginx

    - name: Enable site
      file:
        src: /etc/nginx/sites-available/myapp
        dest: /etc/nginx/sites-enabled/myapp
        state: link
      notify: Restart Nginx

    - name: Remove default Nginx site
      file:
        path: /etc/nginx/sites-enabled/default
        state: absent
      notify: Restart Nginx

    - name: Ensure Nginx is running
      service:
        name: nginx
        state: started
        enabled: yes

  handlers:
    - name: Restart Nginx
      service:
        name: nginx
        state: restarted

4.3 Ansible Roles

# Create the role skeleton
ansible-galaxy init webserver

# Directory structure
# webserver/
# ├── defaults/
# │   └── main.yml
# ├── files/
# ├── handlers/
# │   └── main.yml
# ├── meta/
# │   └── main.yml
# ├── tasks/
# │   └── main.yml
# ├── templates/
# ├── tests/
# │   ├── inventory
# │   └── test.yml
# └── vars/
#     └── main.yml

# tasks/main.yml
---
- name: Update apt cache
  apt:
    update_cache: yes
    cache_valid_time: 3600

- name: Install Nginx
  apt:
    name: nginx
    state: present

- name: Start and enable Nginx
  service:
    name: nginx
    state: started
    enabled: yes

# handlers/main.yml
---
- name: Restart Nginx
  service:
    name: nginx
    state: restarted

# Using the role
# site.yml
---
- name: Configure web servers
  hosts: webservers
  become: yes
  roles:
    - webserver

5. CloudFormation

5.1 CloudFormation Template Basics

# template.yaml
AWSTemplateFormatVersion: '2010-09-09'
Description: 'Web server stack with EC2, VPC, and Security Groups'

Parameters:
  VpcCidr:
    Type: String
    Default: 10.0.0.0/16
    Description: CIDR block for VPC

  InstanceType:
    Type: String
    Default: t2.micro
    AllowedValues:
      - t2.micro
      - t2.small
      - t2.medium
    Description: EC2 instance type

  KeyName:
    Type: AWS::EC2::KeyPair::KeyName
    Description: Name of an existing EC2 KeyPair to enable SSH access

Resources:
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: !Ref VpcCidr
      EnableDnsSupport: true
      EnableDnsHostnames: true
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-vpc'

  PublicSubnet:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      CidrBlock: !Select [0, !Cidr [!Ref VpcCidr, 6, 8]]
      AvailabilityZone: !Select [0, !GetAZs '']
      MapPublicIpOnLaunch: true
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-public-subnet'

  InternetGateway:
    Type: AWS::EC2::InternetGateway
    Properties:
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-igw'

  VPCGatewayAttachment:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      VpcId: !Ref VPC
      InternetGatewayId: !Ref InternetGateway

  RouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-rt'

  InternetRoute:
    Type: AWS::EC2::Route
    Properties:
      RouteTableId: !Ref RouteTable
      DestinationCidrBlock: 0.0.0.0/0
      GatewayId: !Ref InternetGateway

  SubnetRouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PublicSubnet
      RouteTableId: !Ref RouteTable

  SecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Enable HTTP, HTTPS, and SSH access
      VpcId: !Ref VPC
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 80
          ToPort: 80
          CidrIp: 0.0.0.0/0
        - IpProtocol: tcp
          FromPort: 443
          ToPort: 443
          CidrIp: 0.0.0.0/0
        - IpProtocol: tcp
          FromPort: 22
          ToPort: 22
          CidrIp: 0.0.0.0/0
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-sg'

  WebServerInstance:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: !FindInMap [AWSRegionArch2AMI, !Ref 'AWS::Region', AMI]
      InstanceType: !Ref InstanceType
      KeyName: !Ref KeyName
      SubnetId: !Ref PublicSubnet
      SecurityGroupIds:
        - !Ref SecurityGroup
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-web-server'
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash
          yum update -y
          yum install -y httpd
          systemctl start httpd
          systemctl enable httpd
          echo "<h1>Hello from CloudFormation!</h1>" > /var/www/html/index.html

  WebServerEIP:
    Type: AWS::EC2::EIP
    Properties:
      Domain: vpc
      InstanceId: !Ref WebServerInstance

Mappings:
  AWSRegionArch2AMI:
    us-east-1:
      AMI: ami-0c55b159cbfafe1f0
    us-west-2:
      AMI: ami-0d1cd67c26f5fca19
    eu-west-1:
      AMI: ami-0233214e13e500f77

Outputs:
  VPCId:
    Description: VPC ID
    Value: !Ref VPC
    Export:
      Name: !Sub '${AWS::StackName}-vpc-id'

  PublicIP:
    Description: Public IP address of the web server
    Value: !Ref WebServerEIP

  InstanceId:
    Description: Instance ID of the web server
    Value: !Ref WebServerInstance
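
The `PublicSubnet` resource in the template uses `!Cidr [!Ref VpcCidr, 6, 8]`, which asks for 6 CIDR blocks with 8 subnet bits each. For IPv4 the resulting mask is 32 minus the subnet bits. The arithmetic can be checked with a short sketch; the function names are hypothetical, and the VPC prefix of 16 comes from the default `VpcCidr`.

```shell
# Prefix length of the blocks produced by Fn::Cidr for IPv4: 32 - cidrBits.
cidr_subnet_prefix() {
  echo $(( 32 - $1 ))
}

# Number of blocks of that size that fit in a VPC with the given prefix length.
cidr_max_subnets() {
  local vpc_prefix="$1" cidr_bits="$2"
  echo $(( 1 << (32 - cidr_bits - vpc_prefix) ))
}

cidr_subnet_prefix 8      # prints 24
cidr_max_subnets 16 8     # prints 256
```

So the template requests 6 of the 256 possible /24 blocks inside 10.0.0.0/16 and selects the first one for the public subnet.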

5.2 Deploying a CloudFormation Stack

# Validate the template
aws cloudformation validate-template --template-body file://template.yaml

# Create the stack
aws cloudformation create-stack \
  --stack-name my-web-stack \
  --template-body file://template.yaml \
  --parameters ParameterKey=KeyName,ParameterValue=my-key-pair \
  --capabilities CAPABILITY_IAM

# Check the stack status
aws cloudformation describe-stacks --stack-name my-web-stack

# View stack events
aws cloudformation describe-stack-events --stack-name my-web-stack

# Update the stack (KeyName has no default, so reuse its previous value)
aws cloudformation update-stack \
  --stack-name my-web-stack \
  --template-body file://template.yaml \
  --parameters ParameterKey=InstanceType,ParameterValue=t2.small \
               ParameterKey=KeyName,UsePreviousValue=true

# Delete the stack
aws cloudformation delete-stack --stack-name my-web-stack
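
`create-stack` returns as soon as the request is accepted, so scripts usually wait for the result before continuing. The sketch below assumes a configured AWS CLI; `stack_status_failed` and `wait_for_stack` are hypothetical helpers built on the real `aws cloudformation wait stack-create-complete` and `describe-stacks` commands.

```shell
# Succeed when a stack status string indicates a failed or rolled-back operation.
stack_status_failed() {
  case "$1" in
    *FAILED|*ROLLBACK_COMPLETE|*ROLLBACK_IN_PROGRESS) return 0 ;;
    *) return 1 ;;
  esac
}

# Block until creation finishes, print the final status, and fail on rollback.
wait_for_stack() {
  local name="$1" status
  aws cloudformation wait stack-create-complete --stack-name "$name" || true
  status="$(aws cloudformation describe-stacks --stack-name "$name" \
    --query 'Stacks[0].StackStatus' --output text)" || return 1
  echo "$status"
  ! stack_status_failed "$status"
}
```

A pipeline step such as `wait_for_stack my-web-stack` would then fail when the stack ends in a state like `ROLLBACK_COMPLETE` instead of `CREATE_COMPLETE`.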

6. IaC Best Practices

6.1 Code Organization

terraform/
├── environments/
│   ├── dev/
│   │   ├── backend.tf
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── staging/
│   │   ├── backend.tf
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   └── prod/
│       ├── backend.tf
│       ├── main.tf
│       ├── variables.tf
│       └── outputs.tf
├── modules/
│   ├── vpc/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── ec2/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   └── rds/
│       ├── main.tf
│       ├── variables.tf
│       └── outputs.tf
└── scripts/
    ├── validate.sh
    └── deploy.sh

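6.2 Version Control Strategy

# .gitignore
.terraform/
*.tfstate
*.tfstate.*
# Do not ignore .terraform.lock.hcl: commit it so every run uses the same provider versions
*.tfvars
!example.tfvars
crash.log
crash.*.log
override.tf
override.tf.json
*_override.tf
*_override.tf.json

# Branching strategy
# main - production environment
# staging - pre-production environment
# develop - development environment
# feature/* - feature branches
# hotfix/* - urgent fix branches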
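
A `.gitignore` only affects files that are not yet tracked, so a small pre-commit check can act as a second safeguard against committing state or variable files. The following is a sketch with hypothetical function names; the patterns mirror the `.gitignore` entries above, and `git diff --cached --name-only` lists the staged paths.

```shell
# Succeed when a path is a state or variable file that should not be committed.
is_sensitive_tf_file() {
  local name="${1##*/}"            # strip leading directories
  case "$name" in
    example.tfvars) return 1 ;;    # sample file that is allowed
    *.tfstate|*.tfstate.*|*.tfvars) return 0 ;;
    *) return 1 ;;
  esac
}

# Read file paths on stdin and fail if any of them is sensitive.
check_paths() {
  local path status=0
  while IFS= read -r path; do
    if is_sensitive_tf_file "$path"; then
      echo "Refusing to commit: $path" >&2
      status=1
    fi
  done
  return "$status"
}

# Possible use in a pre-commit hook:
#   git diff --cached --name-only | check_paths
```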

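6.3 Security Best Practices

# Mark variables as sensitive
variable "db_password" {
  description = "Database password"
  type        = string
  sensitive   = true
}

# Read the value from the environment
# Terraform has no env() function and defaults cannot call functions;
# export TF_VAR_aws_access_key before running Terraform instead
variable "aws_access_key" {
  type      = string
  sensitive = true
}

# Decrypt secrets with KMS
data "aws_kms_secrets" "example" {
  secret {
    name    = "db_password"
    payload = "AQECAH..."
  }
}

# Restrict resource permissions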
resource "aws_iam_role" "ec2_role" {
  name = "ec2-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "ec2.amazonaws.com"
        }
      }
    ]
  })
}

resource "aws_iam_role_policy" "s3_access" {
  name = "s3-access"
  role = aws_iam_role.ec2_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "s3:GetObject",
          "s3:PutObject"
        ]
        Resource = "arn:aws:s3:::my-bucket/*"
      }
    ]
  })
}
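
Terraform reads input variables from environment variables named `TF_VAR_<name>`, which keeps secrets such as `db_password` out of `.tf` and `.tfvars` files. A minimal sketch follows; `export_tf_var` is a hypothetical helper and the password is a dummy value.

```shell
# export_tf_var NAME VALUE: expose VALUE to Terraform as input variable NAME.
export_tf_var() {
  export "TF_VAR_$1=$2"
}

export_tf_var db_password "example-only-password"
# A later `terraform plan` in this shell would read var.db_password from the environment.
```

In practice the value would normally come from a secrets manager or the CI system's secret store rather than from a literal in a script.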

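Practical Case Studies

Case 1: Deploying a Multi-Environment Architecture with Terraform

Scenario

Use modular Terraform design to deploy a consistent cloud infrastructure architecture for the development, testing, and production environments.

Steps

  1. Create the VPC module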
# modules/vpc/variables.tf
variable "name" {
  description = "Name of the VPC"
  type        = string
}

variable "cidr" {
  description = "CIDR block for VPC"
  type        = string
}

variable "public_subnet_cidrs" {
  description = "CIDR blocks for public subnets"
  type        = list(string)
}

variable "private_subnet_cidrs" {
  description = "CIDR blocks for private subnets"
  type        = list(string)
}

variable "availability_zones" {
  description = "Availability zones"
  type        = list(string)
}

variable "enable_nat_gateway" {
  description = "Enable NAT gateway"
  type        = bool
  default     = true
}

# modules/vpc/main.tf
resource "aws_vpc" "main" {
  cidr_block           = var.cidr
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = {
    Name = "${var.name}-vpc"
  }
}

resource "aws_subnet" "public" {
  count                   = length(var.public_subnet_cidrs)
  vpc_id                  = aws_vpc.main.id
  cidr_block              = var.public_subnet_cidrs[count.index]
  map_public_ip_on_launch = true
  availability_zone       = var.availability_zones[count.index]

  tags = {
    Name = "${var.name}-public-subnet-${count.index + 1}"
    Type = "public"
  }
}

resource "aws_subnet" "private" {
  count             = length(var.private_subnet_cidrs)
  vpc_id            = aws_vpc.main.id
  cidr_block        = var.private_subnet_cidrs[count.index]
  availability_zone = var.availability_zones[count.index]

  tags = {
    Name = "${var.name}-private-subnet-${count.index + 1}"
    Type = "private"
  }
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id

  tags = {
    Name = "${var.name}-igw"
  }
}

resource "aws_eip" "nat" {
  count  = var.enable_nat_gateway ? 1 : 0
  domain = "vpc"

  tags = {
    Name = "${var.name}-nat-eip"
  }

  depends_on = [aws_internet_gateway.main]
}

resource "aws_nat_gateway" "main" {
  count         = var.enable_nat_gateway ? 1 : 0
  allocation_id = aws_eip.nat[0].id
  subnet_id     = aws_subnet.public[0].id

  tags = {
    Name = "${var.name}-nat"
  }

  depends_on = [aws_internet_gateway.main]
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }

  tags = {
    Name = "${var.name}-public-rt"
  }
}

resource "aws_route_table" "private" {
  count  = var.enable_nat_gateway ? 1 : 0
  vpc_id = aws_vpc.main.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.main[0].id
  }

  tags = {
    Name = "${var.name}-private-rt"
  }
}

resource "aws_route_table_association" "public" {
  count          = length(var.public_subnet_cidrs)
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}

resource "aws_route_table_association" "private" {
  count          = var.enable_nat_gateway ? length(var.private_subnet_cidrs) : 0
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private[0].id
}

# modules/vpc/outputs.tf
output "vpc_id" {
  description = "ID of the VPC"
  value       = aws_vpc.main.id
}

output "public_subnet_ids" {
  description = "IDs of public subnets"
  value       = aws_subnet.public[*].id
}

output "private_subnet_ids" {
  description = "IDs of private subnets"
  value       = aws_subnet.private[*].id
}

output "public_route_table_id" {
  description = "ID of public route table"
  value       = aws_route_table.public.id
}

output "private_route_table_id" {
  description = "ID of private route table"
  value       = var.enable_nat_gateway ? aws_route_table.private[0].id : null
}
  2. Create the EC2 module
# modules/ec2/variables.tf
variable "name" {
  description = "Name of the instance"
  type        = string
}

variable "ami" {
  description = "AMI ID"
  type        = string
}

variable "instance_type" {
  description = "Instance type"
  type        = string
}

variable "subnet_id" {
  description = "Subnet ID"
  type        = string
}

variable "security_group_ids" {
  description = "Security group IDs"
  type        = list(string)
}

variable "key_name" {
  description = "Key pair name"
  type        = string
}

variable "user_data" {
  description = "User data script"
  type        = string
  default     = null
}

variable "associate_public_ip_address" {
  description = "Associate public IP address"
  type        = bool
  default     = false
}

# modules/ec2/main.tf
resource "aws_instance" "main" {
  ami                         = var.ami
  instance_type               = var.instance_type
  subnet_id                   = var.subnet_id
  vpc_security_group_ids      = var.security_group_ids
  key_name                    = var.key_name
  user_data                   = var.user_data
  associate_public_ip_address = var.associate_public_ip_address

  tags = {
    Name = var.name
  }
}

# modules/ec2/outputs.tf
output "instance_id" {
  description = "ID of the instance"
  value       = aws_instance.main.id
}

output "public_ip" {
  description = "Public IP address"
  value       = aws_instance.main.public_ip
}

output "private_ip" {
  description = "Private IP address"
  value       = aws_instance.main.private_ip
}
  3. Create the environment configuration
# environments/dev/main.tf
terraform {
  required_version = ">= 1.0"
  
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }

  backend "s3" {
    bucket         = "terraform-state-dev"
    key            = "dev/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks-dev"
  }
}

provider "aws" {
  region = var.aws_region
}

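module "vpc" {
  source = "../../modules/vpc"

  name                 = "dev"
  cidr                 = "10.0.0.0/16"
  public_subnet_cidrs  = ["10.0.1.0/24", "10.0.2.0/24"]
  private_subnet_cidrs = ["10.0.10.0/24", "10.0.11.0/24"]
  availability_zones   = ["us-east-1a", "us-east-1b"]
  enable_nat_gateway   = false
}

# data.aws_ami.ubuntu and aws_security_group.web are not shown here; they are
# assumed to be declared in this environment along the lines of section 2.2.
# The template_file data source is deprecated; templatefile() is the built-in
# replacement (user_data.sh.tftpl is a placeholder file name).
module "web_server" {
  source = "../../modules/ec2"

  name                        = "dev-web-server"
  ami                         = data.aws_ami.ubuntu.id
  instance_type               = "t2.micro"
  subnet_id                   = module.vpc.public_subnet_ids[0]
  security_group_ids          = [aws_security_group.web.id]
  key_name                    = var.key_name
  user_data                   = templatefile("${path.module}/user_data.sh.tftpl", {})
  associate_public_ip_address = true
}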

# environments/dev/variables.tf
variable "aws_region" {
  description = "AWS region"
  type        = string
  default     = "us-east-1"
}

variable "key_name" {
  description = "SSH key pair name"
  type        = string
}

# environments/dev/outputs.tf
output "vpc_id" {
  value = module.vpc.vpc_id
}

output "web_server_public_ip" {
  value = module.web_server.public_ip
}
  4. Deployment script
# scripts/deploy.sh
#!/bin/bash

# Exit on errors, unset variables, and failed pipeline stages
set -euo pipefail

ENVIRONMENT="${1:-}"

if [ -z "$ENVIRONMENT" ]; then
  echo "Usage: $0 <environment>"
  exit 1
fi

# Run this script from the terraform/ directory
cd "environments/$ENVIRONMENT"

echo "Initializing Terraform..."
terraform init

echo "Checking Terraform formatting..."
# -check reports unformatted files instead of rewriting them during a deploy
terraform fmt -check -recursive

echo "Validating Terraform configuration..."
terraform validate

echo "Planning Terraform changes..."
terraform plan -out=tfplan

echo "Applying Terraform changes..."
terraform apply tfplan

echo "Deployment completed successfully!"

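Case 2: Automating Application Deployment with Ansible

Scenario

Use Ansible playbooks to automate the deployment of a three-tier web application (frontend, backend, database) across multiple servers.

Steps

  1. Create the Ansible configuration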
# ansible.cfg
[defaults]
inventory = ./inventory
host_key_checking = False
remote_user = ubuntu
private_key_file = ~/.ssh/my-key.pem
timeout = 30
gathering = smart
fact_caching = jsonfile
fact_caching_connection = /tmp/ansible_facts
fact_caching_timeout = 86400

[privilege_escalation]
become = True
become_method = sudo
become_user = root

[ssh_connection]
pipelining = True
  2. Create the inventory
# inventory
[frontend]
frontend-1 ansible_host=192.168.1.10
frontend-2 ansible_host=192.168.1.11

[backend]
backend-1 ansible_host=192.168.1.20
backend-2 ansible_host=192.168.1.21

[database]
db-master ansible_host=192.168.1.30
db-slave ansible_host=192.168.1.31

[all:vars]
ansible_python_interpreter=/usr/bin/python3
app_version=1.0.0
db_name=myapp
db_user=myapp
# Placeholder only: keep real secrets in Ansible Vault, not in a plain-text inventory
db_password=changeme
  3. Create the database role
# roles/database/tasks/main.yml
---
- name: Install PostgreSQL
  apt:
    name:
      - postgresql
      - postgresql-contrib
      - python3-psycopg2
    state: present

- name: Start and enable PostgreSQL
  service:
    name: postgresql
    state: started
    enabled: yes

- name: Create database
  become_user: postgres
  postgresql_db:
    name: "{{ db_name }}"
    state: present

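# Note: newer community.postgresql releases deprecate `priv` on this module
# in favor of the separate postgresql_privs module
- name: Create database user
  become_user: postgres
  postgresql_user:
    db: "{{ db_name }}"
    name: "{{ db_user }}"
    password: "{{ db_password }}"
    priv: "ALL"
    state: present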

- name: Configure PostgreSQL
  template:
    src: postgresql.conf.j2
    dest: /etc/postgresql/14/main/postgresql.conf
    owner: postgres
    group: postgres
    mode: '0644'
  notify: Restart PostgreSQL

- name: Configure pg_hba.conf
  template:
    src: pg_hba.conf.j2
    dest: /etc/postgresql/14/main/pg_hba.conf
    owner: postgres
    group: postgres
    mode: '0640'
  notify: Restart PostgreSQL

# roles/database/handlers/main.yml
---
- name: Restart PostgreSQL
  service:
    name: postgresql
    state: restarted
  4. Create the backend role
# roles/backend/tasks/main.yml
---
- name: Install Node.js
  shell: |
    curl -fsSL https://deb.nodesource.com/setup_18.x | bash -
    apt-get install -y nodejs
  args:
    creates: /usr/bin/node

- name: Create application user
  user:
    name: myapp
    system: yes
    shell: /bin/bash
    home: /opt/myapp

- name: Create application directory
  file:
    path: /opt/myapp
    state: directory
    owner: myapp
    group: myapp
    mode: '0755'

- name: Deploy application
  git:
    repo: https://github.com/username/myapp-backend.git
    dest: /opt/myapp
    version: "{{ app_version }}"
    force: yes
  become_user: myapp

- name: Install dependencies
  npm:
    path: /opt/myapp
    state: present
  become_user: myapp

- name: Create environment file
  template:
    src: .env.j2
    dest: /opt/myapp/.env
    owner: myapp
    group: myapp
    mode: '0600'
  notify: Restart backend

- name: Create systemd service
  template:
    src: myapp.service.j2
    dest: /etc/systemd/system/myapp.service
    owner: root
    group: root
    mode: '0644'
  notify: Restart backend

- name: Start and enable backend service
  systemd:
    name: myapp
    state: started
    enabled: yes
    daemon_reload: yes

# roles/backend/handlers/main.yml
---
- name: Restart backend
  systemd:
    name: myapp
    state: restarted
    daemon_reload: yes
  5. Create the frontend role
# roles/frontend/tasks/main.yml
---
- name: Install Nginx
  apt:
    name: nginx
    state: present

- name: Create application directory
  file:
    path: /var/www/myapp
    state: directory
    owner: www-data
    group: www-data
    mode: '0755'

- name: Deploy application
  git:
    repo: https://github.com/username/myapp-frontend.git
    dest: /var/www/myapp
    version: "{{ app_version }}"
    force: yes

- name: Build application
  shell: |
    npm install
    npm run build
  args:
    chdir: /var/www/myapp
    creates: /var/www/myapp/dist

- name: Configure Nginx
  template:
    src: nginx.conf.j2
    dest: /etc/nginx/sites-available/myapp
    owner: root
    group: root
    mode: '0644'
  notify: Restart Nginx

- name: Enable site
  file:
    src: /etc/nginx/sites-available/myapp
    dest: /etc/nginx/sites-enabled/myapp
    state: link
  notify: Restart Nginx

- name: Remove default site
  file:
    path: /etc/nginx/sites-enabled/default
    state: absent
  notify: Restart Nginx

- name: Start and enable Nginx
  service:
    name: nginx
    state: started
    enabled: yes

# roles/frontend/handlers/main.yml
---
- name: Restart Nginx
  service:
    name: nginx
    state: restarted
  6. Create the main playbook
# site.yml
---
- name: Deploy database servers
  hosts: database
  become: yes
  tags: [database]
  roles:
    - role: database

- name: Deploy backend servers
  hosts: backend
  become: yes
  tags: [backend]
  roles:
    - role: backend

- name: Deploy frontend servers
  hosts: frontend
  become: yes
  tags: [frontend]
  roles:
    - role: frontend
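  7. Run the deployment
# Syntax check
ansible-playbook site.yml --syntax-check

# List tasks
ansible-playbook site.yml --list-tasks

# Dry run (check mode)
ansible-playbook site.yml --check

# Run the deployment
ansible-playbook site.yml

# Run only what is tagged "database" (tags must be defined in the playbook)
ansible-playbook site.yml --tags database

# Skip everything tagged "frontend"
ansible-playbook site.yml --skip-tags frontend

# Run with up to 10 parallel forks
ansible-playbook site.yml -f 10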
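
The first commands above form a natural sequence: syntax check, dry run, then the real run. A small wrapper can stop at the first failing stage. This is only a sketch; `run_stages` is a hypothetical helper that runs each argument as a command line through `sh -c`.

```shell
# Run each command line in order and stop at the first failure.
run_stages() {
  local stage
  for stage in "$@"; do
    echo "==> $stage"
    sh -c "$stage" || { echo "Stage failed: $stage" >&2; return 1; }
  done
}

# Example invocation (requires Ansible and the playbook from this case study):
#   run_stages \
#     "ansible-playbook site.yml --syntax-check" \
#     "ansible-playbook site.yml --check" \
#     "ansible-playbook site.yml"
```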

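Exercises

  1. Basic exercises

    • Create a simple EC2 instance with Terraform
    • Install Nginx on a remote server with Ansible
    • Write a CloudFormation template that deploys a VPC
  2. Intermediate exercises

    • Deploy a multi-environment architecture using Terraform modules
    • Organize configuration management with Ansible roles
    • Integrate Terraform with Ansible
  3. Challenge exercises

    • Design a complete CI/CD pipeline that incorporates IaC
    • Implement automated testing for infrastructure
    • Establish an infrastructure management process for multiple environments
  4. Questions for reflection

    • How do you choose the right IaC tool?
    • How do you keep IaC secure?
    • How do you handle infrastructure drift?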

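Summary

This episode covered the concepts and implementation of Infrastructure as Code on Linux systems, including the use of Terraform, Ansible, and CloudFormation, along with IaC best practices and continuous integration. After working through it, you should be able to:

  • Understand the concept and benefits of Infrastructure as Code
  • Use Terraform's basic syntax and advanced features
  • Work with Ansible for configuration management and automated deployment
  • Use cloud-native IaC tools such as CloudFormation
  • Design and apply IaC best practices

Infrastructure as Code is at the core of modern DevOps practice. It turns infrastructure management from manual operations into an automated process that is programmable, repeatable, and testable. In real projects, choose an IaC tool based on team skills, cloud platform, and project requirements, and put solid version control, testing, and deployment processes in place to keep the infrastructure stable and reliable.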
