When you start a Pod on a cluster, it is scheduled on a specific node of the cluster. If, at some point, that node is no longer able to host the Pod, the Pod will not be restarted on another node: by itself, the application is not self-healing.
Let’s try this on a cluster with more than one worker (e.g., the cluster installed in Chapter 1).
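You can start a bare Pod and check which node it was scheduled on; a minimal sketch, assuming kubectl is configured for this cluster and the nginx image:

kubectl run nginx --image=nginx
kubectl get pods -o wide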
The NODE column shows that the Pod has been scheduled on the node worker-0.
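To simulate a node that can no longer host the Pod, you can drain it (a sketch; --force is required here because this Pod is not managed by a controller, and --ignore-daemonsets may be needed on a real cluster):

kubectl drain worker-0 --force --ignore-daemonsets
kubectl get pods -o wide

The Pod is deleted, and nothing recreates it on another node.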
Controller to the Rescue
We have seen in Chapter 5, section “Pod Controllers,” that using a Pod controller ensures your Pod is rescheduled on another node if the node hosting it stops working.
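A quick way to verify this, using a Deployment (a sketch; run kubectl uncordon worker-0 first if you drained the node in the previous step):

kubectl create deployment nginx --image=nginx
kubectl get pods -o wide
kubectl drain worker-0 --ignore-daemonsets   # assuming the Pod landed on worker-0
kubectl get pods -o wide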
This time, we can see that a Pod has been recreated on another node of the cluster: our app now survives a node eviction.
Liveness Probes
It is possible to define a liveness probe for each container of a Pod. If the kubelet fails to execute the probe successfully a given number of times in a row, the container is considered unhealthy and is restarted within the same Pod.
This probe should be used to detect that a container has become unresponsive.
Make an HTTP request.
If your container is an HTTP server, you can add an endpoint that always replies with a success response and define the probe with this endpoint. If your backend is no longer healthy, it is probable that this endpoint will not respond either.
Execute a command.
Most server applications come with an associated CLI application. You can use this CLI to execute a very simple operation on the server. If the server is not healthy, it is probable that it will not respond to this simple request either.
Make a TCP connection.
When a server running in a container communicates via a non-HTTP protocol (on top of TCP), you can try to open a socket to the application. If the server is not healthy, it is probable that it will not respond to this connection request.
You have to use the declarative form (a YAML manifest) to declare liveness probes.
A Note About Readiness Probes
Note that it is also possible to define a readiness probe for a container. The main role of the readiness probe is to indicate whether a Pod is ready to serve network requests. The Pod is added to the list of backends of the matching Services only when its readiness probe succeeds.
Later, during the container execution, if a readiness probe fails, the Pod is removed from the list of backends of these Services. This can be useful to detect that a container is not able to handle more connections (e.g., because it is already handling a lot of them) and to stop sending it new ones.
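A readiness probe is declared like a liveness probe, under the readinessProbe field of the container; a minimal sketch (the Pod name and values are arbitrary):

apiVersion: v1
kind: Pod
metadata:
  name: readiness-http
spec:
  containers:
  - name: nginx
    image: nginx
    readinessProbe:
      httpGet:
        path: /
        port: 80
      periodSeconds: 5   # probe every 5 seconds; the Pod is Ready while it succeeds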
HTTP Request Liveness Probe
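Here is a sketch of such a probe (the probe fields are standard Pod spec fields; the Pod name and timing values are arbitrary):

apiVersion: v1
kind: Pod
metadata:
  name: liveness-http
spec:
  containers:
  - name: nginx
    image: nginx
    livenessProbe:
      httpGet:
        path: /healthz
        port: 80
      initialDelaySeconds: 5   # wait before the first probe
      periodSeconds: 10        # then probe every 10 seconds
      failureThreshold: 3      # restart after 3 consecutive failures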
Here, we define a probe that queries the /healthz endpoint. As nginx is not configured by default to reply on this path, it will reply with a 404 response code, and the probe will fail. This is not a realistic case, but it simulates an nginx server that would reply in error to a simple request.
Command Liveness Probe
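A sketch of such a Pod, assuming the standard postgres image (which requires the POSTGRES_PASSWORD environment variable); the Pod name and timing values are arbitrary:

apiVersion: v1
kind: Pod
metadata:
  name: liveness-command
spec:
  containers:
  - name: postgres
    image: postgres
    env:
    - name: POSTGRES_PASSWORD   # required by the postgres image
      value: admin
    livenessProbe:
      exec:
        command:                # run inside the container; a non-zero exit fails the probe
        - psql
        - -U
        - unknownUser
        - -c
        - SELECT 1
      initialDelaySeconds: 10
      periodSeconds: 10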
Here, the liveness probe uses the psql command to connect to the server and execute a very simple SQL query (SELECT 1) as the user unknownUser. As this user does not exist, the query will fail.
TCP Connection Liveness Probe
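A sketch of such a Pod (same postgres image assumption as above; the Pod name and timing values are arbitrary):

apiVersion: v1
kind: Pod
metadata:
  name: liveness-tcp
spec:
  containers:
  - name: postgres
    image: postgres
    env:
    - name: POSTGRES_PASSWORD
      value: admin
    livenessProbe:
      tcpSocket:
        port: 5433   # postgres actually listens on 5432
      initialDelaySeconds: 10
      periodSeconds: 10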
Here, the liveness probe tries to connect to the container on port 5433. As postgres listens on port 5432, the connection will fail.
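In all three cases, you can observe the probe failures and the resulting restarts (Pod names as in the sketches above):

kubectl describe pod liveness-tcp   # shows the probe failure events
kubectl get pods -w                 # shows the RESTARTS counter increasing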
Resource Limits and Quality of Service (QoS) Classes
For each container of a Pod, you can define resource (CPU and memory) requests and limits.
The resource requests values are used to schedule a Pod in a node having at least the requested resources available (see Chapter 9, section “Resource Requests”).
If you do not declare limits, each container still has access to all the resources of the node; in this case, if some Pods are not using all their requested resources at a given time, other containers will be able to use them, and vice versa.
In contrast, if a limit is declared for a container, the container will be constrained to those limits: if it tries to allocate more memory than its limit, the allocation will fail or the container will be killed, so it will probably crash or run in a degraded mode; and its CPU usage will be throttled so it cannot exceed its CPU limit.
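For example, a container declaring requests and limits for both resources could look like this (a sketch; the Pod name and values are arbitrary):

apiVersion: v1
kind: Pod
metadata:
  name: with-resources
spec:
  containers:
  - name: nginx
    image: nginx
    resources:
      requests:
        cpu: 100m        # 0.1 CPU core
        memory: 128Mi
      limits:            # equal to the requests here
        cpu: 100m
        memory: 128Mi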
If all the containers of a Pod declare requests and limits for all resources (CPU and memory), and the limits equal the requests, the Pod will be running with the Guaranteed QoS class.
Otherwise, if at least one container of the Pod declares a resource request or limit, the Pod will be running with the Burstable QoS class.
Finally, if no request or limit is declared for any of its containers, the Pod will be running with the BestEffort QoS class.
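You can read the class assigned to a Pod from its status (for the Pod sketched above, this returns Guaranteed, since its limits equal its requests):

kubectl get pod with-resources -o jsonpath='{.status.qosClass}'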
If a node runs out of an incompressible resource (memory), the associated kubelet can decide to evict one or more Pods, to prevent total starvation of the resource.
Which Pods are evicted depends on their Quality of Service class: BestEffort Pods first, then Burstable ones, and finally Guaranteed ones.