In this article we will discuss how to drain a node in a Kubernetes cluster. Maintenance activities such as patching require downtime, and to carry them out without impacting the applications running within the cluster we need to follow a specific sequence of commands. We will walk through those commands in this article so that maintenance can be performed with minimal impact.
How to drain a node in a Kubernetes cluster?
Prerequisite:
We have a Kubernetes cluster with one master node and two worker nodes, as shown below:
$ kubectl get node
NAME          STATUS   ROLES    AGE    VERSION
k8s-control   Ready    master   167m   v1.18.5
k8s-worker1   Ready    <none>   166m   v1.18.5
k8s-worker2   Ready    <none>   166m   v1.18.5
$
Now let's run one Pod and one Deployment with the following YAML definitions for testing purposes:
Pod definition:
$ cat pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80
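As a side note, a similar standalone Pod could also be created imperatively, without writing a manifest. The command below is only a sketch of the equivalent "kubectl run" invocation (exact behaviour depends on the kubectl version) and is an alternative to, not an addition to, the manifest above:

$ kubectl run my-pod --image=nginx --port=80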
Deployment definition:
$ cat deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
  labels:
    app: my-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-deployment
  template:
    metadata:
      labels:
        app: my-deployment
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
Now let's run both definition files to create the objects:
$ kubectl create -f pod.yaml
pod/my-pod created
$ kubectl create -f deploy.yaml
deployment.apps/my-deployment created
$
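Optionally, before moving on, you can wait for the Deployment to finish rolling out; a small sketch using "kubectl rollout status":

$ kubectl rollout status deployment/my-deployment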
Cross-verify which nodes the standalone Pod and the Deployment's Pods are running on using the command below:
$ kubectl get pods -o wide
NAME                             READY   STATUS    RESTARTS   AGE     IP               NODE          NOMINATED NODE   READINESS GATES
my-deployment-655f58b4cb-2sn4r   1/1     Running   0          25s     192.168.194.70   k8s-worker1   <none>           <none>
my-deployment-655f58b4cb-t4s7t   1/1     Running   0          25s     192.168.126.6    k8s-worker2   <none>           <none>
my-pod                           1/1     Running   0          5m13s   192.168.126.5    k8s-worker2   <none>           <none>
$
You can see from the above output that the standalone Pod is running on the worker node “k8s-worker2” and the Deployment's Pods are running on both worker nodes of the Kubernetes cluster.
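In a busier cluster it can be handy to list only the Pods scheduled on the node you plan to drain. The command below is a sketch that uses a field selector on "spec.nodeName":

$ kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=k8s-worker2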
Draining the node:
Now let's drain the node “k8s-worker2”, since it is running both the standalone Pod and one of the Deployment's Pods.
$ kubectl drain k8s-worker2 --ignore-daemonsets --force
node/k8s-worker2 cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/calico-node-svhvv, kube-system/kube-proxy-5nk2w; deleting Pods not managed by ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet: default/my-pod
evicting pod default/my-deployment-655f58b4cb-hkdfh
evicting pod default/my-pod
pod/my-pod evicted
pod/my-deployment-655f58b4cb-hkdfh evicted
node/k8s-worker2 evicted
$
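If you want to preview which Pods a drain would evict without actually evicting anything, recent kubectl versions support a client-side dry run. This is only a sketch and the exact flag form may vary with the kubectl version:

$ kubectl drain k8s-worker2 --ignore-daemonsets --force --dry-run=client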
Note:
Make sure you use the options “--ignore-daemonsets” and “--force”: “--ignore-daemonsets” is required because DaemonSet-managed Pods (such as the Calico and kube-proxy Pods) are running on the node, and “--force” is required because of the standalone Pod. Otherwise, the system won't allow you to drain the node and throws the following error:
$ kubectl drain k8s-worker2
node/k8s-worker2 cordoned
error: unable to drain node "k8s-worker2", aborting command...

There are pending nodes to be drained:
 k8s-worker2
cannot delete DaemonSet-managed Pods (use --ignore-daemonsets to ignore): kube-system/calico-node-svhvv, kube-system/kube-proxy-5nk2w
cannot delete Pods not managed by ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet (use --force to override): default/my-pod
$
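One more case worth mentioning: if any Pod on the node uses emptyDir volumes, the drain also refuses to proceed until you acknowledge that the emptyDir data will be lost. On recent kubectl versions the flag is "--delete-emptydir-data" (older versions call it "--delete-local-data"), for example:

$ kubectl drain k8s-worker2 --ignore-daemonsets --force --delete-emptydir-data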
Once the node has been drained successfully, check its status using the command below:
$ kubectl get node
NAME          STATUS                     ROLES    AGE    VERSION
k8s-control   Ready                      master   172m   v1.18.5
k8s-worker1   Ready                      <none>   171m   v1.18.5
k8s-worker2   Ready,SchedulingDisabled   <none>   171m   v1.18.5
$
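Behind the scenes, draining first cordons the node, which marks it unschedulable and adds a NoSchedule taint. A quick way to confirm this (assuming the default taint is applied) is:

$ kubectl describe node k8s-worker2 | grep -i taints
Taints:             node.kubernetes.io/unschedulable:NoSchedule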
In the output of “kubectl get node” above, you can see that the node “k8s-worker2” has been marked as “SchedulingDisabled”, so no new Pods will be scheduled on it. Now let's check the status of our standalone Pod and the Deployment's Pods after draining the node.
$ kubectl get pods -o wide
NAME                             READY   STATUS    RESTARTS   AGE    IP               NODE          NOMINATED NODE   READINESS GATES
my-deployment-655f58b4cb-2sn4r   1/1     Running   0          116s   192.168.194.70   k8s-worker1   <none>           <none>
my-deployment-655f58b4cb-gtjt4   1/1     Running   0          20s    192.168.194.71   k8s-worker1   <none>           <none>
$
From the above output it is clear that the standalone Pod “my-pod” was deleted by the drain activity, whereas the Pod managed by the Deployment was evicted and rescheduled on the other node, “k8s-worker1”.
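Keep in mind that standalone Pods are never rescheduled automatically. If “my-pod” is still needed, it has to be recreated from its manifest (while “k8s-worker2” is cordoned, it will land on another schedulable node such as “k8s-worker1”):

$ kubectl create -f pod.yaml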
Joining the maintenance node back to the Kubernetes cluster:
Now let's assume that our maintenance activity on the node “k8s-worker2” is complete and we want to join the node back into the Kubernetes cluster. To do that, we need to uncordon the node as shown below:
$ kubectl uncordon k8s-worker2
node/k8s-worker2 uncordoned
$
Now if you check the status of the node, it shows “Ready” without the “SchedulingDisabled” status, as below:
$ kubectl get node
NAME          STATUS   ROLES    AGE    VERSION
k8s-control   Ready    master   174m   v1.18.5
k8s-worker1   Ready    <none>   174m   v1.18.5
k8s-worker2   Ready    <none>   174m   v1.18.5
$
Also, if you check the status of the Deployment's Pods, they remain unchanged:
$ kubectl get pods -o wide
NAME                             READY   STATUS    RESTARTS   AGE     IP               NODE          NOMINATED NODE   READINESS GATES
my-deployment-655f58b4cb-2sn4r   1/1     Running   0          3m37s   192.168.194.70   k8s-worker1   <none>           <none>
my-deployment-655f58b4cb-gtjt4   1/1     Running   0          2m1s    192.168.194.71   k8s-worker1   <none>           <none>
$
This shows that even after the maintenance node rejoins the Kubernetes cluster, the previously evicted Pods keep running on the node they were rescheduled to; they are not moved back automatically.
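If you want the Deployment's Pods to spread across the workers again, one option (a sketch, not part of the original steps) is a rolling restart, so that fresh Pods go back through the scheduler, which may then place some of them on “k8s-worker2”:

$ kubectl rollout restart deployment my-deployment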
So this is all about the question “How to drain a node in Kubernetes?” for maintenance activities.
In case you want to go through other Kubernetes articles, follow this link.