Kubernetes is a powerful and complex orchestration system for automating the deployment, scaling, and management of containerized applications. Among its many features, Kubernetes provides sophisticated scheduling mechanisms to ensure that workloads are placed on the most appropriate nodes. One key aspect of this scheduling system is the use of taints and tolerations. Understanding these concepts is crucial for managing node and pod behavior effectively.
What are Taints and Tolerations?
In Kubernetes, a taint is a property that can be applied to a node to repel certain pods from being scheduled on that node. Conversely, a toleration is a property applied to a pod that allows it to tolerate (or ignore) the taints on nodes, thus permitting the pod to be scheduled on nodes with specific taints.
Taints and tolerations work together to control which pods can be scheduled on which nodes. This mechanism ensures that workloads are placed in environments that meet their requirements and constraints, improving the efficiency and reliability of the cluster.
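As a quick sketch of how the two halves fit together (using a hypothetical `dedicated=gpu` taint, not one from the workshop below), a node is tainted with `kubectl taint nodes <node> dedicated=gpu:NoSchedule`, and a pod opts back in by declaring a matching toleration in its spec:

```yaml
# Hypothetical illustration: this pod tolerates nodes tainted with
#   kubectl taint nodes <node> dedicated=gpu:NoSchedule
apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload
spec:
  containers:
  - name: app
    image: nginx
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
```

Pods without this toleration are simply never scheduled onto the tainted node.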
Getting Started with Taints
Assuming you have a multi-node Kubernetes cluster set up in your infrastructure, first clone the repository and then label the nodes as shown in the following steps:
```
git clone https://github.com/collabnix/dockerlabs
cd dockerlabs/kubernetes/workshop/Scheduler101/
```
Start labelling the nodes:

```
kubectl label nodes node2 role=dev
kubectl label nodes node3 role=dev
```
Next, taint node2 so that pods without a matching toleration will not be scheduled on it:

```
[node1 Scheduler101]$ kubectl taint nodes node2 role=dev:NoSchedule
node/node2 tainted
```
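The command above uses the NoSchedule effect; kubectl supports two other effects, and a trailing `-` removes a taint. For reference (commands shown for illustration against the same node):

```
# NoSchedule: new pods without a matching toleration are not scheduled here
kubectl taint nodes node2 role=dev:NoSchedule

# PreferNoSchedule: the scheduler tries to avoid the node, but may still use it
kubectl taint nodes node2 role=dev:PreferNoSchedule

# NoExecute: also evicts already-running pods that do not tolerate the taint
kubectl taint nodes node2 role=dev:NoExecute

# Removing a taint: append "-" to the taint you want to drop
kubectl taint nodes node2 role=dev:NoSchedule-
```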
Apply the changes:
```
kubectl apply -f pod-taint-node.yaml
```
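`pod-taint-node.yaml` comes from the cloned repository; a minimal sketch of what such a manifest looks like (the exact file may differ) is an nginx pod steered toward the `role=dev` nodes via node affinity:

```yaml
# Sketch of a manifest like pod-taint-node.yaml (the repository file may differ):
# an nginx pod required to land on a node labelled role=dev.
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: role
            operator: In
            values:
            - dev
  containers:
  - name: nginx
    image: nginx
```

Note that the manifest carries no toleration, so of the two `role=dev` nodes only the untainted node3 remains eligible.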
Viewing Your Pods
```
kubectl get pods --output=wide
```
Get node label details:

```
[node1 Scheduler101]$ kubectl get nodes --show-labels | grep mynode | grep role
node2   Ready   <none>   175m   v1.14.9   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node2,kubernetes.io/os=linux,mynode=worker-1,role=dev
node3   Ready   <none>   175m   v1.14.9   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node3,kubernetes.io/os=linux,mynode=worker-3,role=dev
```
Get Pod Description
```
[node1 Scheduler101]$ kubectl describe pods nginx
Name:               nginx
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               node3/192.168.0.16
Start Time:         Mon, 30 Dec 2019 19:13:45 +0000
Labels:             <none>
Annotations:        kubectl.kubernetes.io/last-applied-configuration:
                      {"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"nginx","namespace":"default"},"spec":{"affinity":{"nodeAffinity":{"re...
Status:             Running
IP:                 10.36.0.1
Containers:
  nginx:
    Container ID:   docker://57d032f4358be89e2fcad7536992b175503565af82ce4f66f4773f6feaf58356
    Image:          nginx
    Image ID:       docker-pullable://nginx@sha256:b2d89d0a210398b4d1120b3e3a7672c16a4ba09c2c4a0395f18b9f7999b768f2
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Mon, 30 Dec 2019 19:14:45 +0000
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-qpgxq (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-qpgxq:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-qpgxq
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  105s  default-scheduler  Successfully assigned default/nginx to node3
  Normal  Pulling    101s  kubelet, node3     Pulling image "nginx"
  Normal  Pulled     57s   kubelet, node3     Successfully pulled image "nginx"
  Normal  Created    47s   kubelet, node3     Created container nginx
  Normal  Started    45s   kubelet, node3     Started container nginx
```
- The pod was deployed on node3: node2 carries the role=dev:NoSchedule taint, and this pod has no matching toleration, so the scheduler avoided node2.
Step Cleanup
Finally you can clean up the resources you created in your cluster:
```
kubectl delete -f pod-taint-node.yaml
```
Tolerations
A toleration is a way of ignoring a taint during scheduling. Tolerations aren’t applied to nodes, but to pods. So, in the example above, if we add a toleration to the PodSpec, the pod could “tolerate” the role=dev taint on node2 and still be scheduled there.
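A toleration matches a taint by key, value, and effect; the `operator` field controls how the value is compared. A sketch, using the `role=dev` taint from above:

```yaml
# Two ways to tolerate role=dev:NoSchedule:
tolerations:
- key: "role"
  operator: "Equal"     # key, value, and effect must all match the taint
  value: "dev"
  effect: "NoSchedule"
# or, more broadly:
# - key: "role"
#   operator: "Exists"  # matches any taint with key "role", regardless of value
#   effect: "NoSchedule"
```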
Getting Started with Tolerations
```
git clone https://github.com/collabnix/dockerlabs
cd dockerlabs/kubernetes/workshop/Scheduler101/
kubectl apply -f pod-tolerations-node.yaml
```
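`pod-tolerations-node.yaml` is the repository's manifest; a minimal sketch consistent with the pod description further below (the exact file may differ) is an `nginx:1.7.9` pod that tolerates the `role=dev:NoSchedule` taint applied earlier:

```yaml
# Sketch of a manifest like pod-tolerations-node.yaml (the repository file may differ):
# the pod tolerates the role=dev:NoSchedule taint applied to node2.
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx:1.7.9
  tolerations:
  - key: "role"
    operator: "Equal"
    value: "dev"
    effect: "NoSchedule"
```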
Viewing Your Pods
```
kubectl get pods --output=wide
```
Which Node Is This Pod Running On?
```
[node1 Scheduler101]$ kubectl describe pods nginx
Name:               nginx
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               node3/192.168.0.16
Start Time:         Mon, 30 Dec 2019 19:20:35 +0000
Labels:             env=test
Annotations:        kubectl.kubernetes.io/last-applied-configuration:
                      {"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"labels":{"env":"test"},"name":"nginx","namespace":"default"},"spec":{"contai...
Status:             Pending
IP:
Containers:
  nginx:
    Container ID:
    Image:          nginx:1.7.9
    Image ID:
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-qpgxq (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  default-token-qpgxq:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-qpgxq
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
                 role=dev:NoSchedule
Events:
  Type    Reason     Age  From               Message
  ----    ------     ---  ----               -------
  Normal  Scheduled  4s   default-scheduler  Successfully assigned default/nginx to node3
  Normal  Pulling    1s   kubelet, node3     Pulling image "nginx:1.7.9"
```
Step Cleanup
Finally you can clean up the resources you created in your cluster:
```
kubectl delete -f pod-tolerations-node.yaml
```
- An important thing to notice, though, is that a toleration may enable a tainted node to accept a pod, but it does not guarantee that the pod runs on that specific node.
- In other words, the tainted node is merely considered one of the candidates for running the pod; if another node has a higher priority score, it will be chosen instead. For situations where the pod must land on a particular node, combine the toleration with nodeSelector or node affinity parameters.
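A sketch of that combination, assuming the taint and labels from this workshop: the toleration lets the pod onto the tainted node2, and the nodeSelector pins it there.

```yaml
# Sketch: toleration + nodeSelector together pin the pod to the tainted node2.
apiVersion: v1
kind: Pod
metadata:
  name: pinned-nginx
spec:
  nodeSelector:
    kubernetes.io/hostname: node2   # well-known label, seen in the node listing above
  containers:
  - name: nginx
    image: nginx
  tolerations:
  - key: "role"
    operator: "Equal"
    value: "dev"
    effect: "NoSchedule"
```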
Conclusion
Taints and tolerations are essential tools for fine-tuning the scheduling behavior of Kubernetes clusters. By understanding and effectively using these mechanisms, you can ensure that your workloads are placed on the most suitable nodes, improving the reliability, performance, and manageability of your applications.
As your Kubernetes environment grows in complexity, mastering taints and tolerations will become increasingly important. These tools offer a high degree of control over node and pod interactions, making them indispensable for any serious Kubernetes administrator or developer. Whether you’re isolating critical workloads, managing resource constraints, or performing maintenance, taints and tolerations provide the flexibility needed to maintain a robust and efficient cluster.