EKS managed node groups support managing Spot Instances. The default allocation strategy is Capacity Optimized, Capacity Rebalancing is enabled, and every node created this way gets the label eks.amazonaws.com/capacityType: SPOT.
Let's first create a managed node group that declares Spot capacity:
cat << EOF > add-mng-spot.yml
---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
managedNodeGroups:
  - name: mng-spot-4vcpu-8gb
    desiredCapacity: 2
    minSize: 1
    maxSize: 4
    spot: true
    instanceTypes:
      - c5.xlarge
      - c5a.xlarge
      - c5ad.xlarge
      - c5d.xlarge
      - c6a.xlarge
    taints:
      - key: spotInstance
        value: "true"
        effect: NoSchedule
    labels:
      intent: apps
      managed-by: mng-spot
metadata:
  name: ${EKS_CLUSTER_NAME}
  region: ${AWS_DEFAULT_REGION}
EOF
Now create the new EKS managed node group:
eksctl create nodegroup --config-file=add-mng-spot.yml
Creation of the node group will take 3-4 minutes.
There are a few things to note in the configuration that we just used to create this node group.
The five instance types were all chosen to offer the same 4 vCPU / 8 GB footprint; diversifying across pools of the same size improves your chances of obtaining Spot capacity (a tool such as ec2-instance-selector can help find such a set).
The node group applies the taint spotInstance: "true:NoSchedule". NoSchedule indicates we prefer that pods not be scheduled on Spot Instances unless they explicitly tolerate the taint.
If you are wondering at this stage "where is the Spot bid price?", you have missed some of the changes EC2 Spot Instances have gone through since 2017. Since November 2017, the EC2 Spot price changes infrequently, based on long-term supply and demand of spare capacity in each pool independently. You can still set a maxPrice in scenarios where you want to enforce a maximum budget; by default, maxPrice is set to the On-Demand price. Regardless of the maxPrice value, Spot Instances are charged at the current Spot market price.
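Because you are billed the Spot market price rather than your maxPrice, the saving over On-Demand is simply the gap between the two prices. A quick sketch with illustrative numbers (the prices below are assumptions, not current quotes; check your region's current Spot price with aws ec2 describe-spot-price-history):

```shell
# Illustrative prices in USD/hour -- NOT current quotes.
ON_DEMAND=0.170   # c5.xlarge On-Demand (us-west-2, illustrative)
SPOT=0.064        # a plausible Spot price for the same pool (illustrative)

# You pay the Spot market price, so the effective saving is:
awk -v od="$ON_DEMAND" -v sp="$SPOT" \
  'BEGIN { printf "saving over On-Demand: %.0f%%\n", (od - sp) / od * 100 }'
# prints: saving over On-Demand: 62%
```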
Confirm that the new nodes joined the cluster correctly:
kubectl get nodes -l intent=apps -L eks.amazonaws.com/capacityType,eks.amazonaws.com/nodegroup
The output will show all of the nodes we have provisioned to run our applications:
NAME                                           STATUS   ROLES    AGE     VERSION                CAPACITYTYPE   NODEGROUP
ip-192-168-19-58.us-west-2.compute.internal    Ready    <none>   118m    v1.22.12-eks-ba74326   SPOT           mng-spot-4vcpu-8gb
ip-192-168-43-230.us-west-2.compute.internal   Ready    <none>   5d15h   v1.22.12-eks-ba74326   ON_DEMAND      mng-od-4vcpu-8gb
ip-192-168-70-31.us-west-2.compute.internal    Ready    <none>   5d15h   v1.22.12-eks-ba74326   ON_DEMAND      mng-od-4vcpu-8gb
ip-192-168-95-234.us-west-2.compute.internal   Ready    <none>   118m    v1.22.12-eks-ba74326   SPOT           mng-spot-4vcpu-8gb
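To check the On-Demand/Spot split without eyeballing the listing, you can count nodes per capacity type. The sample lines below are copied from the output above so the pipe can be tried without a cluster; in practice you would pipe kubectl get nodes ... --no-headers into the awk instead:

```shell
# Field 6 is CAPACITYTYPE, given the -L flags used in the kubectl command above.
printf '%s\n' \
  'ip-192-168-19-58.us-west-2.compute.internal Ready <none> 118m v1.22.12-eks-ba74326 SPOT mng-spot-4vcpu-8gb' \
  'ip-192-168-43-230.us-west-2.compute.internal Ready <none> 5d15h v1.22.12-eks-ba74326 ON_DEMAND mng-od-4vcpu-8gb' \
  'ip-192-168-70-31.us-west-2.compute.internal Ready <none> 5d15h v1.22.12-eks-ba74326 ON_DEMAND mng-od-4vcpu-8gb' \
  'ip-192-168-95-234.us-west-2.compute.internal Ready <none> 118m v1.22.12-eks-ba74326 SPOT mng-spot-4vcpu-8gb' \
  | awk '{ n[$6]++ } END { for (t in n) printf "%s %d\n", t, n[t] }'
```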
Use the AWS Management Console to inspect the managed node groups deployed in your Kubernetes cluster.
Handling Spot interruptions does not require installing third-party tools (such as the AWS Node Termination Handler); a managed node group handles interruptions as follows:
Spot Capacity Rebalancing is enabled by default, which minimizes the time an online application is affected (see: https://compute.kpingfan.com/02-asg/11.capacity-rebalacing/ ). When a rebalance recommendation is received and the replacement Spot node reaches the Ready state, EKS first cordons the original Spot node (marking it unschedulable) and then drains it, evicting its pods onto other nodes. The overall flow is as follows:
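The managed node group performs these steps for you, but the sequence can be mirrored manually with kubectl. A minimal sketch (rebalance_node is a hypothetical helper name introduced here for illustration):

```shell
# Sketch only: managed node groups run this sequence automatically on a
# rebalance recommendation, once the replacement Spot node is Ready.
rebalance_node() {
  local node="$1"
  # 1. Cordon: mark the old Spot node unschedulable so no new pods land on it.
  kubectl cordon "$node"
  # 2. Drain: evict its running pods so they reschedule onto other nodes.
  kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data
}

# Example: rebalance_node ip-192-168-19-58.us-west-2.compute.internal
```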
Now let's deploy a new workload that leverages the Spot capacity we just added to the EKS cluster. The manifest below uses a nodeSelector to ensure that it will only use nodes that offer Spot capacity:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: team5
  labels:
    app.kubernetes.io/created-by: eks-finhack
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: workload5
  namespace: team5
  labels:
    app.kubernetes.io/created-by: eks-finhack
spec:
  replicas: 3
  selector:
    matchLabels:
      app: workload5
  template:
    metadata:
      labels:
        app: workload5
    spec:
      nodeSelector:
        intent: apps
        eks.amazonaws.com/capacityType: SPOT
      terminationGracePeriodSeconds: 0
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.2
          resources:
            requests:
              cpu: "250m"
              memory: 1Gi
      tolerations:
        - key: "spotInstance"
          operator: "Equal"
          value: "true"
          effect: "NoSchedule"
EOF
Can you check which nodes these pods were scheduled on? You can use either kubectl or kube-ops-view.
There is one more thing that we’ve accomplished!
We have achieved a significant cost saving over On-Demand prices that we can apply in a controlled way and at scale. We hope these savings will help you try new experiments or build other cool projects. Now Go Build!