Cilium has an option to configure the policy enforcement mode.
The default mode is not restrictive enough for our purpose: we want to block all traffic unless it is explicitly allowed.
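For illustration, and assuming Cilium was installed with Helm, switching the enforcement mode to always (deny everything that is not explicitly allowed) could look like the following sketch, where policyEnforcementMode is the chart value backing this option:
# Sketch: enforce policies on every endpoint, denying traffic that is not explicitly allowed.
# Assumes Cilium is already installed from its Helm chart in the kube-system namespace.
helm upgrade cilium cilium/cilium \
  --namespace kube-system \
  --reuse-values \
  --set policyEnforcementMode=always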
With Cilium installed, the cluster network should now be ready. As we don’t have any network policy configured yet, pods that require communication should be in error though.
After a while, cluster nodes should become Ready:
$ kubectl get node
NAME STATUS ROLES AGE VERSION
kind-control-plane Ready control-plane,master 9m20s v1.23.3
kind-control-plane2 Ready control-plane,master 9m8s v1.23.3
kind-control-plane3 Ready control-plane,master 8m16s v1.23.3
kind-worker Ready <none> 7m59s v1.23.3
kind-worker2 Ready <none> 7m59s v1.23.3
kind-worker3 Ready <none> 7m59s v1.23.3
A couple of critical pods should exhibit some errors though:
$ kubectl get pod -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
.....
.....
kube-system coredns-64897985d-s78wv 0/1 Running 0 11m 10.244.5.77 kind-worker3 <none> <none>
kube-system coredns-64897985d-vrz8c 0/1 Running 0 11m 10.244.5.70 kind-worker3 <none> <none>
.....
.....
local-path-storage local-path-provisioner-5ddd94ff66-prh4p 0/1 Error 2 (47s ago) 11m 10.244.5.166 kind-worker3 <none> <none>
From the list above, core-dns pods are not getting Ready and local-path-provisioner is in Error.
This is because those pods need to talk with the api server, but there is no network policy that allows such communication.
Basically, all pods running with hostNetwork will be fine, but those without it will be in trouble if they need to communicate with another pod and don’t have a network policy that allows the traffic.
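To see which pods fall into which category, a quick check of the hostNetwork field helps (standard kubectl, nothing Cilium specific):
# List all pods together with their hostNetwork setting
kubectl get pods -A -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,HOSTNETWORK:.spec.hostNetwork'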
In order to resolve the core-dns pods issue, we will add a CiliumNetworkPolicy to allow core-dns pods to talk to the api server:
kubectl apply -f - <<EOF
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: core-dns
  namespace: kube-system
spec:
  endpointSelector:
    matchLabels:
      io.cilium.k8s.policy.serviceaccount: coredns
  egress:
  - toEntities:
    - kube-apiserver
EOF
With this policy, core-dns pods should reach the Ready state.
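To double check, we can watch the CoreDNS pods until they report Ready (in kind and kubeadm clusters they usually carry the k8s-app=kube-dns label):
# Watch the CoreDNS pods become Ready
kubectl -n kube-system get pod -l k8s-app=kube-dns -w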
There are two interesting things to note about the policy above:
- core-dns pods are matched using their service account. We could have used pod labels too (see the sketch below), but matching on a service account is a great Cilium feature: a service account represents an identity, regardless of the pods that use it, and being able to express policies using identities makes them easier to write and understand.
- toEntities in the egress rule uses predefined targets managed directly by Cilium. This makes it easy to whitelist well-known targets like the api server.
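For comparison, here is a sketch of what the selector could look like with pod labels instead of the service account identity, assuming the usual k8s-app=kube-dns label carried by CoreDNS pods:
  # Hypothetical alternative: select CoreDNS pods by label rather than by service account
  endpointSelector:
    matchLabels:
      k8s-app: kube-dns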
In the same spirit as core-dns pods, local-path-provisioner pods need to talk with the api server.
We can apply almost the same policy to fix the issue:
kubectl apply -f - <<EOF
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: local-path-provisioner
  namespace: local-path-storage
spec:
  endpointSelector:
    matchLabels:
      io.cilium.k8s.policy.serviceaccount: local-path-provisioner-service-account
  egress:
  - toEntities:
    - kube-apiserver
EOF
Once the policy above is created, local-path-provisioner pods should run without errors.
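A quick check should confirm it, assuming the default Deployment name used by kind for the provisioner:
# The pod should stop restarting and its logs should no longer show connection errors
kubectl -n local-path-storage get pod
kubectl -n local-path-storage logs deploy/local-path-provisioner --tail=20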
Unfortunately, yes. We have other pods not running with hostNetwork that require communication:
kubectl apply -f - <<EOF
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: hubble-relay
  namespace: kube-system
spec:
  endpointSelector:
    matchLabels:
      io.cilium.k8s.policy.serviceaccount: hubble-relay
  egress:
  - toEntities:
    - host
    - remote-node
EOF
This time, we allow Hubble Relay to talk with the cluster nodes: the Cilium agents run in hostNetwork, and therefore Hubble Relay needs to be allowed to talk to the remote-node and host entities.
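A simple way to confirm this, assuming the default hubble-relay Deployment name, is to wait for its rollout and glance at its logs:
# Hubble Relay should become Ready once it can reach the agents running on the nodes
kubectl -n kube-system rollout status deploy/hubble-relay
kubectl -n kube-system logs deploy/hubble-relay --tail=20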
For this one, we will try to connect to Hubble UI with our browser.
First, run a port-forward command to the Hubble UI service:
kubectl port-forward -n kube-system svc/hubble-ui 8888:80
Then browse http://localhost:8888/ and see what happens:
We are able to communicate with Hubble UI but no namespaces are visible in the list.
This makes sense as Hubble UI needs to talk to the api server to fetch the namespace list.
Let’s allow Hubble UI pods to talk with the api server:
kubectl apply -f - <<EOF
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: hubble-ui
  namespace: kube-system
spec:
  endpointSelector:
    matchLabels:
      io.cilium.k8s.policy.serviceaccount: hubble-ui
  egress:
  - toEntities:
    - kube-apiserver
EOF
This looks better, as we now have the namespace list working:
Unfortunately this is not enough: trying to dig into a namespace will not work because Hubble UI is not allowed to communicate with Hubble Relay and therefore cannot retrieve flows.
We need to allow communication between Hubble UI and Hubble Relay:
kubectl apply -f - <<EOF
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: hubble-ui
  namespace: kube-system
spec:
  endpointSelector:
    matchLabels:
      io.cilium.k8s.policy.serviceaccount: hubble-ui
  egress:
  - toEntities:
    - kube-apiserver
  - toEndpoints:
    - matchLabels:
        io.cilium.k8s.policy.serviceaccount: hubble-relay
---
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: hubble-relay
  namespace: kube-system
spec:
  endpointSelector:
    matchLabels:
      io.cilium.k8s.policy.serviceaccount: hubble-relay
  ingress:
  - fromEndpoints:
    - matchLabels:
        io.cilium.k8s.policy.serviceaccount: hubble-ui
  egress:
  - toEntities:
    - host
    - remote-node
EOF
Note that we need to modify two policies here:
- an egress rule on the Hubble UI side
- an ingress rule on the Hubble Relay side
That is, communication has to be authorized on both sides.
Unfortunately, this still doesn’t work. Looking at the Hubble UI pod logs will reveal that it cannot resolve the Hubble Relay service:
level=error msg="hubble status checker: failed to connect to hubble-relay: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp: lookup hubble-relay on 10.96.0.10:53: read udp 10.244.5.4:55406->10.96.0.10:53: i/o timeout\"\n" subsys=ui-backend
Once again, this makes sense: the Hubble UI pods need to resolve the IP address of the Hubble Relay service, and that involves communication. This communication was not whitelisted though, so observing an error is completely expected.
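To confirm that the 10.96.0.10 address in the log is indeed the cluster DNS service backed by the core-dns pods, we can look at the kube-dns Service:
# The ClusterIP of the kube-dns Service should match the address in the error above
kubectl -n kube-system get svc kube-dns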
To fix this error, we need to update both policies again: allow egress from Hubble UI pods to core-dns pods, and allow ingress into core-dns pods from Hubble UI pods:
kubectl apply -f - <<EOF
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: hubble-ui
  namespace: kube-system
spec:
  endpointSelector:
    matchLabels:
      io.cilium.k8s.policy.serviceaccount: hubble-ui
  egress:
  - toEntities:
    - kube-apiserver
  - toEndpoints:
    - matchLabels:
        io.cilium.k8s.policy.serviceaccount: hubble-relay
    - matchLabels:
        io.cilium.k8s.policy.serviceaccount: coredns
---
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: core-dns
  namespace: kube-system
spec:
  endpointSelector:
    matchLabels:
      io.cilium.k8s.policy.serviceaccount: coredns
  ingress:
  - fromEndpoints:
    - matchLabels:
        io.cilium.k8s.policy.serviceaccount: hubble-ui
  egress:
  - toEntities:
    - kube-apiserver
EOF
Finally, Hubble UI should now display flows correctly 🎉
Now that Hubble UI works, we can observe that some communication is still blocked: the communication between core-dns pods and the world entity is not allowed.
We can easily fix this by adding the world entity to the egress whitelist of core-dns pods:
kubectl apply -f - <<EOF
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: core-dns
  namespace: kube-system
spec:
  endpointSelector:
    matchLabels:
      io.cilium.k8s.policy.serviceaccount: coredns
  ingress:
  - fromEndpoints:
    - matchLabels:
        io.cilium.k8s.policy.serviceaccount: hubble-ui
  egress:
  - toEntities:
    - kube-apiserver
    - world
EOF
We should now no longer observe unauthorized communications, as all legitimate traffic has been whitelisted.
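As a final sanity check, assuming the cilium and hubble CLIs are installed locally, we can look for any remaining dropped flows:
# Forward the Hubble Relay API locally, then list recently dropped flows
cilium hubble port-forward &
hubble observe --verdict DROPPED --last 100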
In the end, configuring network policies in a cluster is HARD. In this article we did it for a small number of workloads, so doing it at a larger scale definitely takes some work.
Luckily, CiliumNetworkPolicy simplifies a couple of things with entities, and using service accounts to describe rules certainly helps; it stays complex by nature though.
On the other hand it is worth the effort as it vastly improves security in the cluster.
Although it’s a good start, rules could be even more restrictive, for example by also covering pods running with hostNetwork.
This leaves room for other articles 😏
Please note that Cilium network policies support more options for configuring ingress and egress rules than were covered in this article.