Reference: https://medium.com/codex/establish-cilium-clustermesh-whelm-chart-11b08b0c995c
ClusterMesh, Cilium's multi-cluster solution, provides a number of benefits for cross-cluster communication and network traffic, such as:
1. Pod IP routing across multiple Kubernetes clusters at native performance via tunneling or direct-routing without requiring any gateways or proxies.
2. Transparent service discovery with standard Kubernetes services and coredns/kube-dns.
3. Network policy enforcement spanning multiple clusters. Policies can be specified as Kubernetes NetworkPolicy resources or the extended CiliumNetworkPolicy CRD.
4. Transparent encryption for all communication between nodes in the local cluster as well as across cluster boundaries.
The most attractive points for us are 3) and 4).
For point 3), it is pretty useful for multi-tenant network isolation in multi-cluster environments. We want to apply Pod network policy rules against targets located in other Kubernetes clusters. Cilium ClusterMesh would be a perfect solution here.
For point 4), it improves the security of service communication among the clusters. (More details are in the Cilium ClusterMesh documentation.)
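To make point 3) concrete, here is a minimal sketch of a cross-cluster policy. The frontend and backend labels are hypothetical workloads (frontend in cluster1, backend in cluster2); the io.cilium.k8s.policy.cluster label is the one Cilium attaches to endpoints to identify their cluster in a ClusterMesh.
# Sketch: in cluster2, only allow ingress to "backend" pods from "frontend" pods running in cluster1.
kubectl --context $CLUSTER2 apply -f - <<'EOF'
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-frontend-from-cluster1
spec:
  endpointSelector:
    matchLabels:
      app: backend
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend
            io.cilium.k8s.policy.cluster: cluster1
EOF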
Therefore, we want to build Cilium ClusterMesh in our multi-cluster environments, and this post shows what I did to set up ClusterMesh on AWS EKS clusters, specifically with the Helm chart instead of the Cilium CLI. Helm is what we mostly use in our IaC (Terraform), and it is easy to deploy through our pipeline.
There are some requirements for setting up Cilium ClusterMesh on two Kubernetes clusters, such as IP connectivity between the clusters: node IPs (and also Pod IPs, if using native-routing mode) must be able to communicate with each other across clusters.
The full list of prerequisites is in the Cilium documentation.
The two EKS clusters are first connected with VPC peering or a Transit Gateway so that Pods and nodes in each cluster can ping each other across clusters (a quick check is sketched after the CIDR examples). In addition, the CIDR ranges of the two clusters must not overlap, for example:
10.103.0.0/16
10.104.0.0/16
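Before enabling ClusterMesh, it is worth sanity-checking the peering. Assuming ICMP is allowed between the VPCs, grab a node IP from one cluster and ping it from a throwaway pod in the other:
# Hypothetical check: list node internal IPs in cluster2, then ping one of them from cluster1.
kubectl --context $CLUSTER2 get nodes -o wide
kubectl --context $CLUSTER1 run netcheck --rm -it --image=busybox --restart=Never -- ping -c 3 10.104.28.22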
The following command enables ClusterMesh and deploys clustermesh-apiserver in the cluster:
❯ cilium clustermesh enable --context $CLUSTER1
🔮 Auto-exposing service within AWS VPC (service.beta.kubernetes.io/aws-load-balancer-internal: 0.0.0.0/0)
🔑 Found CA in secret cilium-ca
🔑 Generating certificates for ClusterMesh...
✨ Deploying clustermesh-apiserver from quay.io/cilium/clustermesh-apiserver:v1.12.4...
✅ ClusterMesh enabled!
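Since we manage Cilium through Helm rather than the Cilium CLI, the same step can be expressed as chart values instead. Below is a minimal sketch assuming the official cilium/cilium chart; the value names follow the chart's values.yaml and may differ slightly between chart versions, and cluster.name/cluster.id must be unique per cluster (e.g. cluster2 / 2 on the second cluster).
# Sketch: enable the clustermesh-apiserver via Helm values instead of "cilium clustermesh enable".
helm upgrade cilium cilium/cilium --version 1.12.4 \
  --namespace kube-system --reuse-values \
  --set cluster.name=cluster1 \
  --set cluster.id=1 \
  --set clustermesh.useAPIServer=true \
  --set clustermesh.apiserver.service.type=LoadBalancer
The internal-LB annotation shown in the CLI output above can be added under clustermesh.apiserver.service.annotations in a values file.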
Because clustermesh-apiserver is deployed as a Service of type LoadBalancer, an internal AWS ELB is created for it:
❯ cilium clustermesh status --context $CLUSTER1
Hostname based ingress detected, trying to resolve it
Hostname resolved, using the found ip(s)
✅ Cluster access information is available:
- 10.103.27.163:2379
- 10.103.98.190:2379
✅ Service "clustermesh-apiserver" of type "LoadBalancer" found
🔌 Cluster Connections:
🔀 Global services: [ min:0 / avg:0.0 / max:0 ]
Run the same command against the second cluster ($CLUSTER2). Then run the following command to connect the two clusters:
❯ cilium clustermesh connect --context $CLUSTER1 --destination-context $CLUSTER2
✨ Extracting access information of cluster cluster2...
🔑 Extracting secrets from cluster cluster2...
Hostname based ingress detected, trying to resolve it
Hostname resolved, using the found ip(s)
ℹ️ Found ClusterMesh service IPs: [10.104.119.68 10.104.48.130]
✨ Extracting access information of cluster cluster1...
🔑 Extracting secrets from cluster cluster1...
Hostname based ingress detected, trying to resolve it
Hostname resolved, using the found ip(s)
ℹ️ Found ClusterMesh service IPs: [10.103.98.190 10.103.27.163]
✨ Connecting cluster arn:aws:eks:us-east-2:1234567890:cluster/cluster1 -> arn:aws:eks:us-east-2:1234567890:cluster/cluster2...
🔑 Secret cilium-clustermesh does not exist yet, creating it...
🔑 Patching existing secret cilium-clustermesh...
✨ Patching DaemonSet with IP aliases cilium-clustermesh...
✨ Connecting cluster arn:aws:eks:us-east-2:1234567890:cluster/cluster2 -> arn:aws:eks:us-east-2:1234567890:cluster/cluster1...
🔑 Secret cilium-clustermesh does not exist yet, creating it...
🔑 Patching existing secret cilium-clustermesh...
✨ Patching DaemonSet with IP aliases cilium-clustermesh...
✅ Connected cluster arn:aws:eks:us-east-2:1234567890:cluster/cluster1 and arn:aws:eks:us-east-2:1234567890:cluster/cluster2!
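If the connection should also be driven by Helm instead of the CLI, each cluster can declare the remote cluster's apiserver endpoints under clustermesh.config. The sketch below is for cluster1 and reuses the cluster2 LoadBalancer IPs from the output above; the value names follow the Cilium Helm chart, and the remote cluster's TLS material still has to be supplied under clustermesh.config.clusters[0].tls (which the CLI otherwise handles for you).
# Sketch: point cluster1 at cluster2's clustermesh-apiserver endpoints via Helm values.
helm upgrade cilium cilium/cilium --version 1.12.4 \
  --namespace kube-system --reuse-values \
  --set clustermesh.config.enabled=true \
  --set 'clustermesh.config.clusters[0].name=cluster2' \
  --set 'clustermesh.config.clusters[0].port=2379' \
  --set 'clustermesh.config.clusters[0].ips[0]=10.104.119.68' \
  --set 'clustermesh.config.clusters[0].ips[1]=10.104.48.130'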
Run the clustermesh status command again:
❯ cilium clustermesh status --context $CLUSTER1
Hostname based ingress detected, trying to resolve it
Hostname resolved, using the found ip(s)
✅ Cluster access information is available:
- 10.103.27.163:2379
- 10.103.98.190:2379
✅ Service "clustermesh-apiserver" of type "LoadBalancer" found
✅ All 4 nodes are connected to all clusters [min:1 / avg:1.0 / max:1]
🔌 Cluster Connections:
- cluster2: 4/4 configured, 4/4 connected
🔀 Global services: [ min:13 / avg:13.0 / max:13 ]
A new cluster2 connection now shows up. Running the same status command against cluster2 gives similar output:
❯ cilium clustermesh status --context $CLUSTER2
Hostname based ingress detected, trying to resolve it
Hostname resolved, using the found ip(s)
✅ Cluster access information is available:
- 10.104.119.68:2379
- 10.104.48.130:2379
✅ Service "clustermesh-apiserver" of type "LoadBalancer" found
✅ All 2 nodes are connected to all clusters [min:1 / avg:1.0 / max:1]
🔌 Cluster Connections:
- cluster1: 2/2 configured, 2/2 connected
🔀 Global services: [ min:14 / avg:14.0 / max:14 ]
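The Global services counter refers to Kubernetes services shared across the mesh. A service becomes global when it carries Cilium's global-service annotation in every cluster where it is defined. As a sketch, echo-server below is a hypothetical service name, and io.cilium/global-service is the annotation documented for Cilium 1.12 (newer releases also accept service.cilium.io/global):
# Mark the same service as global in both clusters so its endpoints are load-balanced across the mesh.
kubectl --context $CLUSTER1 -n default annotate service echo-server io.cilium/global-service="true" --overwrite
kubectl --context $CLUSTER2 -n default annotate service echo-server io.cilium/global-service="true" --overwrite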
Another verification can be done on the Cilium DaemonSet pods. Open a shell to a cilium pod and run cilium status --verbose to display more detailed status.
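For example (assuming the Cilium DaemonSet is named cilium in the kube-system namespace, which is the chart default):
# Run the verbose status check inside one of the cilium agent pods.
kubectl --context $CLUSTER1 -n kube-system exec -it ds/cilium -- cilium status --verbose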
There is a section for cluster health, and you would see all the nodes from both clusters if everything is set up properly.
Cluster health: 6/6 reachable
Name IP Node Endpoints
cluster1/ip-10-103-77-148.us-east-2.compute.internal (localhost) 10.103.77.148 reachable reachable
cluster1/ip-10-103-0-33.us-east-2.compute.internal 10.103.0.33 reachable reachable
cluster1/ip-10-103-48-26.us-east-2.compute.internal 10.103.48.26 reachable reachable
cluster1/ip-10-103-90-232.us-east-2.compute.internal 10.103.90.232 reachable reachable
cluster2/ip-10-104-108-60.us-east-2.compute.internal 10.104.108.60 reachable reachable
cluster2/ip-10-104-28-22.us-east-2.compute.internal 10.104.28.22 reachable reachable
Another command we could run is cilium-health status, which shows the health status of each node from both clusters.
root@ip-10-103-77-148:/home/cilium# cilium-health status
Nodes:
cluster1/ip-10-103-77-148.us-east-2.compute.internal (localhost):
Host connectivity to 10.103.77.148:
ICMP to stack: OK, RTT=537.702µs
HTTP to agent: OK, RTT=101.574µs
Endpoint connectivity to 10.103.77.254:
ICMP to stack: OK, RTT=530.279µs
HTTP to agent: OK, RTT=206.954µs
cluster1/ip-10-103-0-33.us-east-2.compute.internal:
Host connectivity to 10.103.0.33:
ICMP to stack: OK, RTT=1.083102ms
HTTP to agent: OK, RTT=1.089925ms
Endpoint connectivity to 10.103.6.167:
ICMP to stack: OK, RTT=1.111456ms
HTTP to agent: OK, RTT=1.268708ms
cluster1/ip-10-103-48-26.us-east-2.compute.internal:
Host connectivity to 10.103.48.26:
ICMP to stack: OK, RTT=1.21498ms
HTTP to agent: OK, RTT=991.185µs
Endpoint connectivity to 10.103.40.104:
ICMP to stack: OK, RTT=1.220342ms
HTTP to agent: OK, RTT=1.265738ms
cluster1/ip-10-103-90-232.us-east-2.compute.internal:
Host connectivity to 10.103.90.232:
ICMP to stack: OK, RTT=1.007351ms
HTTP to agent: OK, RTT=475.152µs
Endpoint connectivity to 10.103.67.2:
ICMP to stack: OK, RTT=550.526µs
HTTP to agent: OK, RTT=444.213µs
cluster2/ip-10-104-108-60.us-east-2.compute.internal:
Host connectivity to 10.104.108.60:
ICMP to stack: OK, RTT=737.763µs
HTTP to agent: OK, RTT=617.232µs
Endpoint connectivity to 10.104.103.40:
ICMP to stack: OK, RTT=936.179µs
HTTP to agent: OK, RTT=549.053µs
cluster2/ip-10-104-28-22.us-east-2.compute.internal:
Host connectivity to 10.104.28.22:
ICMP to stack: OK, RTT=1.936628ms
HTTP to agent: OK, RTT=1.056997ms
Endpoint connectivity to 10.104.138.202:
ICMP to stack: OK, RTT=2.024877ms
HTTP to agent: OK, RTT=1.398727ms
Since the domain name of an ELB can be used, how about using a domain owned by your org or company, such as cluster1.mesh.my-company.com?
With external-dns, it is as simple as adding an annotation to the clustermesh-apiserver Service.
external-dns automatically creates the DNS record specified in the annotation in Route 53, pointing at the ELB, so Cilium agents can use it to reach the remote clustermesh-apiserver as well.
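A minimal sketch, assuming external-dns already watches this cluster and manages the Route 53 zone for mesh.my-company.com; the hostname annotation is the standard external-dns one, and it could equally be set through the Helm value clustermesh.apiserver.service.annotations:
# Ask external-dns to publish a friendly DNS name for the clustermesh-apiserver LoadBalancer.
kubectl --context $CLUSTER1 -n kube-system annotate service clustermesh-apiserver \
  external-dns.alpha.kubernetes.io/hostname=cluster1.mesh.my-company.com --overwrite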