Migrating from Public Subnets to Private Subnets Within the Same VPC

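The steps are as follows:

  1. Create a VPC with 3 public subnets and 3 private subnets

  2. Create an EKS cluster with its node group in the public subnets

  3. Create the VPC interface endpoints

  4. Create a new node group in the private subnets and migrate the workloads from the public node group to it

  5. Delete the original public node group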

Create the VPC

Create a VPC with 3 public and 3 private subnets, without a NAT gateway, and with the S3 Gateway Endpoint that is created by default:

image-20231010165330505

image-20231009212139752

Once the VPC is created, note down the subnet IDs of the three public and three private subnets.

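A node group in public subnets requires auto-assign public IPv4 to be enabled, so first turn on automatic public IP assignment for the three public subnets: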

image-20231010190308926
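
The same subnet setting can also be changed from the command line. The sketch below assumes the AWS CLI is installed and configured; the function name `enable_public_ip` is illustrative and not part of the original walkthrough, and the region falls back to us-east-1 when `AWS_REGION` is unset.

```shell
# Enable auto-assign public IPv4 on every subnet ID passed as an argument.
# enable_public_ip is a hypothetical helper name used for illustration.
enable_public_ip() {
  local subnet_id
  for subnet_id in "$@"; do
    aws ec2 modify-subnet-attribute \
      --region "${AWS_REGION:-us-east-1}" \
      --subnet-id "$subnet_id" \
      --map-public-ip-on-launch || return 1
  done
}
```

For the VPC in this walkthrough it would be called as `enable_public_ip subnet-0f11f8787de11fe89 subnet-0320b2a6239af84b1 subnet-0a082988cacbad513`.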

Create the EKS cluster + public subnet nodegroup

Create the EKS cluster together with a public node group (replace the VPC ID and subnet IDs with your own):

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: eks-private-subnet
  region: us-east-1

vpc:
  id: "vpc-0fb2f18727d563b49"
  clusterEndpoints:
    publicAccess: true
    privateAccess: true
  subnets:
    private:
      private-1a:
          id: "subnet-0ccc71adb888fbb6e"
      private-1b:
          id: "subnet-06a427d7bcff200f8"
      private-1c:
          id: "subnet-0d729073422fe2952"
    public:
      public-1a:
          id: "subnet-0f11f8787de11fe89"
      public-1b:
          id: "subnet-0320b2a6239af84b1"
      public-1c:
          id: "subnet-0a082988cacbad513"

managedNodeGroups:
  - name: ng-1
    instanceType: m5.xlarge
    subnets:
      - public-1a
      - public-1b
    desiredCapacity: 2

Create the cluster:

eksctl create cluster -f cluster.yaml

Once the cluster is up, deploy an image from ECR to simulate a production application:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: tomcat
  labels:
    app: tomcat
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tomcat
  template:
    metadata:
      labels:
        app: tomcat
    spec:
      containers:
      - name: tomcat
        image: 145197526627.dkr.ecr.us-east-1.amazonaws.com/java-app
        imagePullPolicy: Always
        env:
        - name: ALLOW_EMPTY_PASSWORD
          value: "yes"
        ports:
        - containerPort: 80
          protocol: TCP
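
To apply the manifest above and confirm the pod actually starts, something like the following could be used. It is a minimal sketch: the helper name `deploy_and_wait`, the file name `tomcat.yaml`, and the 120-second timeout are assumptions, not values from the original walkthrough.

```shell
# Apply a manifest and wait until the named Deployment has rolled out.
# Arguments: <manifest-file> <deployment-name>
# deploy_and_wait is a hypothetical helper name used for illustration.
deploy_and_wait() {
  kubectl apply -f "$1" || return 1
  kubectl rollout status "deployment/$2" --timeout=120s
}
```

With the manifest saved as `tomcat.yaml`, the call would be `deploy_and_wait tomcat.yaml tomcat`.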

Create the interface endpoints

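An interface endpoint must be associated with a security group, so create one in advance that allows inbound traffic on port 443 from the VPC CIDR range (EC2 instances in the private subnets reach the endpoint services over HTTPS):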

image-20231010225214218
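
The security group can also be created with the AWS CLI. This is a sketch under stated assumptions: the helper name `create_endpoint_sg` and the group name `vpc-endpoint-sg` are hypothetical, and the VPC CIDR must be supplied by the caller because the original does not state it.

```shell
# Create a security group that allows HTTPS (443) from the VPC CIDR.
# Arguments: <vpc-id> <vpc-cidr>; prints the new security group ID.
# The group name vpc-endpoint-sg is an assumption, not from the original.
create_endpoint_sg() {
  local sg_id
  sg_id="$(aws ec2 create-security-group \
    --region "${AWS_REGION:-us-east-1}" \
    --group-name vpc-endpoint-sg \
    --description "HTTPS from VPC CIDR to interface endpoints" \
    --vpc-id "$1" \
    --query GroupId --output text)" || return 1
  aws ec2 authorize-security-group-ingress \
    --region "${AWS_REGION:-us-east-1}" \
    --group-id "$sg_id" \
    --protocol tcp --port 443 \
    --cidr "$2" >/dev/null || return 1
  echo "$sg_id"
}
```

A call such as `create_endpoint_sg vpc-0fb2f18727d563b49 <vpc-cidr>` prints the ID to pass to the endpoint creation command.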

The endpoints to create are:

Service | Endpoint
Amazon EC2 | com.amazonaws.region-code.ec2
Amazon Elastic Container Registry (for pulling container images) | com.amazonaws.region-code.ecr.api, com.amazonaws.region-code.ecr.dkr, and com.amazonaws.region-code.s3
Application Load Balancers and Network Load Balancers | com.amazonaws.region-code.elasticloadbalancing
Amazon EC2 Auto Scaling | com.amazonaws.region-code.autoscaling
Amazon CloudWatch Logs | com.amazonaws.region-code.logs
AWS Security Token Service (required when using IAM roles for service accounts) | com.amazonaws.region-code.sts

Because several endpoints are needed, create them from the command line instead of repeating the steps in the console:

aws ec2 create-vpc-endpoint --region us-east-1 \
    --vpc-id vpc-0fb2f18727d563b49 \
    --vpc-endpoint-type Interface \
    --service-name com.amazonaws.us-east-1.logs \
    --subnet-ids subnet-0ccc71adb888fbb6e subnet-06a427d7bcff200f8 subnet-0d729073422fe2952 \
    --security-group-ids sg-003f39fae9ec72745

Here subnet-ids takes the IDs of the three private subnets, and security-group-ids takes the ID of the security group created above.

Then replace service-name with each of the following in turn:

  • com.amazonaws.us-east-1.ecr.dkr
  • com.amazonaws.us-east-1.sts
  • com.amazonaws.us-east-1.ec2
  • com.amazonaws.us-east-1.elasticloadbalancing
  • com.amazonaws.us-east-1.ecr.api
  • com.amazonaws.us-east-1.autoscaling

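Run the command above six more times, once per service name, to create the corresponding endpoints. Then check the endpoints in the console; they take roughly 1-2 minutes to become available: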

image-20231010222143881
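
Repeating the command by hand can be replaced with a loop over the seven service names (logs plus the six listed above). The sketch below assumes the AWS CLI is configured; `create_interface_endpoints` is an illustrative name, and the region falls back to us-east-1 when `AWS_REGION` is unset.

```shell
# Create one interface endpoint per service in the list.
# Arguments: <vpc-id> <security-group-id> <subnet-id>...
# create_interface_endpoints is a hypothetical helper name.
create_interface_endpoints() {
  local region vpc_id sg_id svc
  region="${AWS_REGION:-us-east-1}"
  vpc_id="$1"
  sg_id="$2"
  shift 2   # the remaining arguments are the private subnet IDs
  for svc in logs ecr.dkr ecr.api sts ec2 elasticloadbalancing autoscaling; do
    aws ec2 create-vpc-endpoint --region "$region" \
      --vpc-id "$vpc_id" \
      --vpc-endpoint-type Interface \
      --service-name "com.amazonaws.${region}.${svc}" \
      --subnet-ids "$@" \
      --security-group-ids "$sg_id" || return 1
  done
}
```

It would be invoked as `create_interface_endpoints vpc-0fb2f18727d563b49 sg-003f39fae9ec72745 subnet-0ccc71adb888fbb6e subnet-06a427d7bcff200f8 subnet-0d729073422fe2952`. The function stops at the first failed call, so a partially created set is easy to spot.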

Create the private subnet nodegroup

Add a private nodegroup to the original configuration by appending the 7 lines at the bottom:

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: eks-private-subnet
  region: us-east-1

vpc:
  id: "vpc-0fb2f18727d563b49"
  clusterEndpoints:
    publicAccess: true
    privateAccess: true
  subnets:
    private:
      private-1a:
          id: "subnet-0ccc71adb888fbb6e"
      private-1b:
          id: "subnet-06a427d7bcff200f8"
      private-1c:
          id: "subnet-0d729073422fe2952"
    public:
      public-1a:
          id: "subnet-0f11f8787de11fe89"
      public-1b:
          id: "subnet-0320b2a6239af84b1"
      public-1c:
          id: "subnet-0a082988cacbad513"

managedNodeGroups:
  - name: ng-1
    instanceType: m5.xlarge
    subnets:
      - public-1a
      - public-1b
    desiredCapacity: 2

  - name: ng-2
    instanceType: m5.xlarge
    subnets:
      - private-1a
      - private-1b
    desiredCapacity: 2
    privateNetworking: true

Run:

eksctl create nodegroup -f cluster.yaml

Once the node group is created, the two new nodes have no public IP:

image-20231010225239523

Drain the workloads from the two public nodes, running the command once for each node:

kubectl drain --ignore-daemonsets --delete-emptydir-data ip-10-0-31-41.ec2.internal

image-20231010225822713
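
Instead of draining node by node, every node in the public node group can be selected by label. The sketch below relies on the `eks.amazonaws.com/nodegroup` label that EKS sets on managed node group instances; the helper name `drain_nodegroup` is illustrative. It cordons all nodes before draining any of them, so evicted pods cannot be rescheduled onto another public node.

```shell
# Cordon, then drain, every node of one managed node group.
# Argument: <nodegroup-name>
# drain_nodegroup is a hypothetical helper name used for illustration.
drain_nodegroup() {
  local nodes node
  nodes="$(kubectl get nodes -l "eks.amazonaws.com/nodegroup=$1" -o name)" || return 1
  for node in $nodes; do
    # "-o name" prints node/<name>; strip the prefix before passing it on
    kubectl cordon "${node#node/}" || return 1
  done
  for node in $nodes; do
    kubectl drain --ignore-daemonsets --delete-emptydir-data "${node#node/}" || return 1
  done
}
```

For this walkthrough the call would be `drain_nodegroup ng-1`.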

The production workload is now running on the nodes in the private node group:

image-20231010225921309

Delete the public node group:

eksctl delete nodegroup --cluster=<clusterName> --name=ng-1

Confirm that only the private nodes remain:

image-20231010230401195
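
The same check can be done from the command line by listing which node groups still have nodes. This is a minimal sketch: `remaining_nodegroups` is an illustrative name, and it assumes all nodes belong to managed node groups, so the label column is the last field on every row.

```shell
# Print the distinct node group names that still have nodes in the cluster.
# -L adds the label value as the last column; awk picks that column.
# remaining_nodegroups is a hypothetical helper name used for illustration.
remaining_nodegroups() {
  kubectl get nodes --no-headers -L eks.amazonaws.com/nodegroup \
    | awk '{print $NF}' | sort -u
}
```

After the deletion, `remaining_nodegroups` should list only ng-2.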