Running a stateful Pod

Let's look at another use case. We used Deployments/ReplicaSets to replicate Pods. That approach scales well and is easy to maintain, and Kubernetes assigns each Pod a DNS name derived from its IP address, such as <Pod IP address>.<namespace>.pod.cluster.local.
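The Pod's DNS name is simply its IP address with the dots replaced by dashes, prefixed to the namespace and the pod.cluster.local suffix. A minimal sketch of that transformation (the IP and namespace are taken from the example below):

```shell
#!/bin/sh
# Compose a Pod DNS name: dots in the Pod IP become dashes,
# followed by <namespace>.pod.cluster.local
ip="10.52.1.8"
ns="default"
fqdn="$(echo "$ip" | tr '.' '-').${ns}.pod.cluster.local"
echo "$fqdn"
```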

The following example demonstrates how the Pod DNS will be assigned:

$ kubectl run apache2 --image=httpd --replicas=3
deployment "apache2" created


//one of the Pods has the IP address 10.52.1.8
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
apache2-55c684c66b-7m5zq 1/1 Running 0 5s 10.52.1.8 gke-chap7-default-pool-64212da9-z96q
apache2-55c684c66b-cjkcz 1/1 Running 0 1m 10.52.0.7 gke-chap7-default-pool-64212da9-8gzm
apache2-55c684c66b-v78tq 1/1 Running 0 1m 10.52.2.5 gke-chap7-default-pool-64212da9-bbs6


//another Pod can reach it at 10-52-1-8.default.pod.cluster.local
$ kubectl exec apache2-55c684c66b-cjkcz -- ping -c 2 10-52-1-8.default.pod.cluster.local
PING 10-52-1-8.default.pod.cluster.local (10.52.1.8): 56 data bytes
64 bytes from 10.52.1.8: icmp_seq=0 ttl=62 time=1.642 ms
64 bytes from 10.52.1.8: icmp_seq=1 ttl=62 time=0.322 ms
--- 10-52-1-8.default.pod.cluster.local ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.322/0.982/1.642/0.660 ms

However, this DNS entry is not guaranteed to keep pointing at the same Pod, because the Pod might crash due to an application error or a node resource shortage. In such a case, the IP address will likely change:

$ kubectl delete pod apache2-55c684c66b-7m5zq
pod "apache2-55c684c66b-7m5zq" deleted


//the replacement Pod has a new IP address (10.52.1.9)
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
apache2-55c684c66b-7m5zq 0/1 Terminating 0 1m <none> gke-chap7-default-pool-64212da9-z96q
apache2-55c684c66b-cjkcz 1/1 Running 0 2m 10.52.0.7 gke-chap7-default-pool-64212da9-8gzm
apache2-55c684c66b-l9vqt 1/1 Running 0 7s 10.52.1.9 gke-chap7-default-pool-64212da9-z96q
apache2-55c684c66b-v78tq 1/1 Running 0 2m 10.52.2.5 gke-chap7-default-pool-64212da9-bbs6


//the old DNS entry no longer resolves to a running Pod
$ kubectl exec apache2-55c684c66b-cjkcz -- ping -c 2 10-52-1-8.default.pod.cluster.local
PING 10-52-1-8.default.pod.cluster.local (10.52.1.8): 56 data bytes
92 bytes from gke-chap7-default-pool-64212da9-z96q.c.kubernetes-cookbook.internal (192.168.2.4): Destination Host Unreachable
92 bytes from gke-chap7-default-pool-64212da9-z96q.c.kubernetes-cookbook.internal (192.168.2.4): Destination Host Unreachable
--- 10-52-1-8.default.pod.cluster.local ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss

For some applications, this causes a problem; for example, a clustered application whose members must be reachable at stable DNS names or IP addresses. In the current Kubernetes implementation, IP addresses can't be preserved for Pods. How about using a Kubernetes Service? A Service does preserve a DNS name, but it's not practical to create one Service per Pod; in the previous case, we would have to create three Services, each bound one-to-one to a Pod.

Kubernetes has a solution for this kind of use case: the StatefulSet. It preserves not only the DNS name but also the persistent volume bound to each Pod. Even if a Pod crashes, the StatefulSet guarantees that the replacement Pod is bound to the same DNS name and persistent volume. Note that the IP address is still not preserved, due to the current Kubernetes implementation.
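The key ingredients are a Headless Service (spec.clusterIP: None) and a StatefulSet that references it via spec.serviceName. A minimal sketch, modeled on the namenode.yaml used below (the names, port, and image tag are assumptions; see the referenced gist for the full configuration):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: hdfs-namenode-svc
spec:
  clusterIP: None          # Headless mode: DNS resolves directly to Pod IPs
  selector:
    app: hdfs-namenode
  ports:
  - port: 8020             # NameNode RPC port
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: hdfs-namenode
spec:
  serviceName: hdfs-namenode-svc   # binds stable per-Pod DNS to the Headless Service
  replicas: 1
  selector:
    matchLabels:
      app: hdfs-namenode
  template:
    metadata:
      labels:
        app: hdfs-namenode
    spec:
      containers:
      - name: namenode
        image: uhopper/hadoop-namenode
```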

To demonstrate, we'll use the Hadoop Distributed File System (HDFS) to launch one NameNode and three DataNodes. We'll use the Docker images from https://hub.docker.com/r/uhopper/hadoop/, which provides both NameNode and DataNode images. In addition, we'll borrow the YAML configuration files namenode.yaml and datanode.yaml from https://gist.github.com/polvi/34ef498a967de563dc4252a7bfb7d582 with a few small changes:

  1. Let's launch a Service and a StatefulSet for the NameNode and the DataNodes:
//create NameNode
$ kubectl create -f https://raw.githubusercontent.com/kubernetes-cookbook/second-edition/master/chapter3/3-4/namenode.yaml
service "hdfs-namenode-svc" created
statefulset "hdfs-namenode" created

$ kubectl get statefulset
NAME DESIRED CURRENT AGE
hdfs-namenode 1 1 19s

$ kubectl get pods
NAME READY STATUS RESTARTS AGE
hdfs-namenode-0 1/1 Running 0 26s


//create DataNodes
$ kubectl create -f https://raw.githubusercontent.com/kubernetes-cookbook/second-edition/master/chapter3/3-4/datanode.yaml
statefulset "hdfs-datanode" created

$ kubectl get statefulset
NAME DESIRED CURRENT AGE
hdfs-datanode 3 3 50s
hdfs-namenode 1 1 5m

$ kubectl get pods
NAME READY STATUS RESTARTS AGE
hdfs-datanode-0 1/1 Running 0 9m
hdfs-datanode-1 1/1 Running 0 9m
hdfs-datanode-2 1/1 Running 0 9m
hdfs-namenode-0 1/1 Running 0 9m

As you can see, the Pod naming convention is <StatefulSet-name>-<sequence-number>. For example, the NameNode Pod's name is hdfs-namenode-0, and the DataNode Pods' names are hdfs-datanode-0, hdfs-datanode-1, and hdfs-datanode-2.

In addition, both the NameNode and the DataNodes have a Service that is configured in Headless mode (by spec.clusterIP: None). Therefore, you can access these Pods using a DNS name of the form <pod-name>.<service-name>.<namespace>.svc.cluster.local. In this case, the NameNode's DNS entry is hdfs-namenode-0.hdfs-namenode-svc.default.svc.cluster.local.
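The stable name is a straightforward composition of the Pod name, the Headless Service name, and the namespace; a quick sketch using the names from this example:

```shell
#!/bin/sh
# Compose the stable FQDN a StatefulSet Pod gets through its Headless Service:
# <pod-name>.<service-name>.<namespace>.svc.cluster.local
pod="hdfs-namenode-0"
svc="hdfs-namenode-svc"
ns="default"
fqdn="${pod}.${svc}.${ns}.svc.cluster.local"
echo "$fqdn"
```

Unlike the IP-derived Pod DNS name shown earlier, this name survives Pod restarts because the Pod name itself is stable.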

  2. Let's check the NameNode Pod's IP address; you can get this using kubectl get pods -o wide as follows:
//Pod hdfs-namenode-0 has an IP address as 10.52.2.8
$ kubectl get pods hdfs-namenode-0 -o wide
NAME READY STATUS RESTARTS AGE IP NODE
hdfs-namenode-0 1/1 Running 0 9m 10.52.2.8 gke-chapter3-default-pool-97d2e17c-0dr5
  3. Next, log in to (run /bin/bash in) one of the DataNodes using kubectl exec, resolve this DNS name, and check whether the IP address is 10.52.2.8:
$ kubectl exec hdfs-datanode-1 -it -- /bin/bash
root@hdfs-datanode-1:/#
root@hdfs-datanode-1:/# ping -c 1 hdfs-namenode-0.hdfs-namenode-svc.default.svc.cluster.local
PING hdfs-namenode-0.hdfs-namenode-svc.default.svc.cluster.local (10.52.2.8): 56 data bytes
...
...

Looks all good! For demonstration purposes, let's access the HDFS web console to check the DataNodes' status.

  4. To do that, use kubectl port-forward to access the NameNode web port (tcp/50070):
//check the status by HDFS web console
$ kubectl port-forward hdfs-namenode-0 :50070
Forwarding from 127.0.0.1:60107 -> 50070
  5. The preceding result indicates that your local machine's TCP port 60107 (your result will vary) has been forwarded to the NameNode Pod's TCP port 50070. Therefore, use a web browser to access http://127.0.0.1:60107/ as follows:
HDFS Web console shows three DataNodes

As you can see, the three DataNodes have registered with the NameNode successfully. The DataNodes also use a Headless Service, so the same naming convention assigns DNS names to the DataNodes as well.
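Since the DataNode Pod names follow the same <StatefulSet-name>-<sequence-number> convention, their stable DNS names can be enumerated. A sketch (the Service name hdfs-datanode-svc is an assumption based on the NameNode's naming pattern; check datanode.yaml for the actual serviceName):

```shell
#!/bin/sh
# Enumerate the stable DNS names of the three DataNode Pods.
# NOTE: "hdfs-datanode-svc" is an assumed Headless Service name.
svc="hdfs-datanode-svc"
ns="default"
for i in 0 1 2; do
  echo "hdfs-datanode-${i}.${svc}.${ns}.svc.cluster.local"
done
```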
