Chapter 5. Kubernetes Networking Abstractions

Previously, we covered a swath of networking fundamentals and how traffic in Kubernetes gets from A to B. In this chapter, we will discuss networking abstractions in Kubernetes, primarily service discovery and load balancing. Most notably, this is the chapter on Services and Ingresses. Both resources are notoriously complex, due to the large number of options, as they attempt to solve numerous use cases. They are the most visible part of the Kubernetes network stack, as they define the basic network characteristics of workloads on Kubernetes. This is where developers interact with the networking stack for their applications deployed on Kubernetes.

This chapter will cover fundamental examples of Kubernetes networking abstractions, and details on how they work. To follow along, you will need the following tools.

  • Docker

  • Kind

  • Linkerd

You will need to be familiar with the kubectl exec and docker exec commands. If you are not, our code repo will have any and all of the commands we discuss, so don't worry too much. We will also make use of ip and netns from Chapters 2 and 3. Note that most of these tools are for debugging and showing implementation details; you would not necessarily need them during normal operations.

Tip

kubectl will be a key tool in this chapter’s examples, and it is the standard for operators to interact with clusters and their networks. You should be familiar with kubectl create, apply, get, delete, and exec commands. You can read more at kubernetes.io/docs/reference/generated/kubectl/kubectl-commands, or by running kubectl [command] --help.

Docker, Kind, and Linkerd installs are available on their respective sites, and we’ve provided more information in the book’s code repository as well.

This chapter will explore these Kubernetes Networking Abstractions:

  • Statefulsets

  • Endpoints

    • Endpoint Slices

  • Services

    • NodePort

    • ClusterIP

    • Headless

    • ExternalName

    • LoadBalancer

  • Ingress

    • Ingress Controller

    • Ingress rules

  • Service Meshes

    • Linkerd

To explore these abstractions, we will deploy the examples to our Kubernetes cluster with the following steps:

  1. Deploy Kind Cluster with ingress enabled

  2. Explore StatefulSets

  3. Deploy Kubernetes Services

  4. Deploy an Ingress Controller

  5. Deploy Linkerd Service Mesh

These abstractions are at the heart of what the Kubernetes API provides to developers and administrators to programmatically control the flow of communications into and out of the cluster. Understanding and mastering the deployment of these abstractions is crucial to the success of any workload inside a cluster. After working through these examples, you will understand which abstractions to use in which situations for your applications.

With the kind cluster configuration YAML, we can use kind to create that cluster with the below command. If this is the first time running it, it will take some time to download all the Docker images for the worker and control plane nodes.
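
The configuration filename below is an assumption; use the kind configuration file provided in the book's code repository for this chapter.

kind create cluster --config kind-config.yaml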

Note

The following examples assume that you still have the local kind cluster running from the previous chapter, along with the golang web server and the dnsutils images for testing.

StatefulSets

StatefulSets are a workload abstraction in Kubernetes that manages pods much like a Deployment. Unlike a Deployment, StatefulSets add the following features for applications that require them:

  • Stable, unique network identifiers.

  • Stable, persistent storage.

  • Ordered, graceful deployment and scaling.

  • Ordered, automated rolling updates.

The Deployment resource is better suited for applications that do not have these requirements (for example, a service which stores data in an external database).

Our database for the Golang minimal web server uses a StatefulSet. The database has a Service, a ConfigMap for the Postgres username, password, and test database name, and a StatefulSet for the containers running Postgres.
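
The full manifest lives in database.yaml in the book's code repository. A trimmed sketch of the StatefulSet portion (the container image shown here is an assumption) looks roughly like this:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres        # the Service that governs this StatefulSet
  replicas: 2
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:12-alpine   # assumed image tag
        ports:
        - containerPort: 5432
        envFrom:
        - configMapRef:
            name: postgres-config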

Let us deploy it now.

kubectl apply -f database.yaml
service/postgres created
configmap/postgres-config created
statefulset.apps/postgres created

Let us examine the DNS and network ramifications of using a statefulset.

To test DNS inside the cluster, we can use the dnsutils image; this image is gcr.io/kubernetes-e2e-test-images/dnsutils:1.3 and is used for Kubernetes testing.

kubectl apply -f dnsutils.yaml

pod/dnsutils created

kubectl get pods
NAME       READY   STATUS    RESTARTS   AGE
dnsutils   1/1     Running   0          9s

With the replicas configured as two pods, we see the StatefulSet deploy postgres-0 and then postgres-1, in that order (a feature of StatefulSets), with IP addresses 10.244.1.3 and 10.244.2.3, respectively.

kubectl get pods -o wide
NAME         READY   STATUS    RESTARTS   AGE   IP           NODE
dnsutils     1/1     Running   0          15m   10.244.3.2   kind-worker3
postgres-0   1/1     Running   0          15m   10.244.1.3   kind-worker2
postgres-1   1/1     Running   0          14m   10.244.2.3   kind-worker

Here is the name of our headless Service, postgres, which clients can use in queries to return the endpoint IP addresses.

kubectl get svc postgres
NAME       TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
postgres   ClusterIP   10.105.214.153   <none>        5432/TCP   23m

Using our dnsutils image, we can see that the DNS names for the StatefulSet's pods will return those IP addresses, along with the cluster IP of the postgres Service.

kubectl exec dnsutils -- host postgres-0.postgres.default.svc.cluster.local.
postgres-0.postgres.default.svc.cluster.local has address 10.244.1.3

kubectl exec dnsutils -- host postgres-1.postgres.default.svc.cluster.local.
postgres-1.postgres.default.svc.cluster.local has address 10.244.2.3

kubectl exec dnsutils -- host postgres
postgres.default.svc.cluster.local has address 10.105.214.153

StatefulSets attempt to mimic a fixed group of persistent machines. As a generic solution for stateful workloads, their specific behavior may be frustrating in particular use cases.

A common problem that users encounter is an update that requires manual intervention to fix when using .spec.updateStrategy.type: RollingUpdate and .spec.podManagementPolicy: OrderedReady, both of which are default settings. With these settings, a user must manually intervene if an updated pod never becomes ready.
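
For reference, the two fields in question live on the StatefulSet spec and can be set explicitly; this minimal snippet shows only the defaults just discussed:

spec:
  podManagementPolicy: OrderedReady   # default: pods are created, updated, and scaled in order
  updateStrategy:
    type: RollingUpdate               # default: a pod that never becomes ready stalls the rollout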

Also, StatefulSets require a Service, preferably headless, to be responsible for the network identity of the pods, and end users are responsible for creating this Service.

Statefulsets have many configuration options, and many third party alternatives exist (both generic stateful workload controllers, and software-specific workload controllers).

StatefulSets offer functionality for a specific use case in Kubernetes. They should not be used for everyday application deployments. Later in this chapter, we discuss more appropriate networking abstractions for run-of-the-mill deployments.

In our next section, we will explore Endpoints and EndpointSlices, the backbone of Kubernetes Services.

Endpoints

Endpoints help identify which pods are running for the Service that they power. Endpoints are created and managed by Services. We will discuss Services on their own later, to avoid covering too many new things at once. For now, let us just say that a Service contains a standard label selector (introduced in Chapter 4), which defines which pods are in the Endpoints.

In Figure 5-1 we can see traffic being directed to an endpoint on node 2, pod 5.

Kubernetes Endpoints
Figure 5-1. Endpoints in a Service

Let us discuss how this Endpoint is created and maintained in the cluster.

Each endpoint contains a list of ports (which apply to all pods), and two lists of addresses: ready and unready.

apiVersion: v1
kind: Endpoints
metadata:
  name: demo-endpoints
subsets:
- addresses:
  - ip: 10.0.0.1
  notReadyAddresses:
  - ip: 10.0.0.2
  ports:
  - port: 8080
    protocol: TCP

Addresses are listed in .addresses if they are passing pod readiness checks. Addresses are listed in .notReadyAddresses if they are not. This makes endpoints a service discovery tool, where you can watch an endpoints object to see the health and addresses of all pods.

kubectl get endpoints clusterip-service
NAME                ENDPOINTS                                                     AGE
clusterip-service   10.244.1.5:8080,10.244.2.7:8080,10.244.2.8:8080 + 1 more...   135m

We can get a better view of all the Addresses with kubectl describe.

 kubectl describe endpoints clusterip-service
Name:         clusterip-service
Namespace:    default
Labels:       app=app
Annotations:  endpoints.kubernetes.io/last-change-trigger-time: 2021-01-30T18:51:36Z
Subsets:
  Addresses:          10.244.1.5,10.244.2.7,10.244.2.8,10.244.3.9
  NotReadyAddresses:  <none>
  Ports:
    Name     Port  Protocol
    ----     ----  --------
    <unset>  8080  TCP

Events:
  Type     Reason                  Age   From                 Message
  ----     ------                  ----  ----                 -------

Let us remove the app label from a pod and see how Kubernetes responds. In a separate terminal, run this command, which will allow us to see changes to the pods in real time.

kubectl get pods -w

In another separate terminal let us do the same thing with endpoints.

kubectl get endpoints -w

We now need to get a pod name to remove from the endpoint object.

 kubectl get pods -l app=app -o wide
NAME                   READY   STATUS    RESTARTS   AGE   IP           NODE
app-5586fc9d77-7frts   1/1     Running   0          19m   10.244.1.5   kind-worker2
app-5586fc9d77-mxhgw   1/1     Running   0          19m   10.244.3.9   kind-worker3
app-5586fc9d77-qpxwk   1/1     Running   0          20m   10.244.2.7   kind-worker
app-5586fc9d77-tpz8q   1/1     Running   0          19m   10.244.2.8   kind-worker

With kubectl label, we can change the app=app label on the pod app-5586fc9d77-7frts.

 kubectl label pod app-5586fc9d77-7frts app=nope --overwrite
pod/app-5586fc9d77-7frts labeled

Both watch commands, on Endpoints and on pods, will see changes for the same reason: the removal of the label on the pod. The Endpoints controller noticed a change to the pods with the label app=app, and so did the Deployment controller. So Kubernetes did what Kubernetes does: it made the real state reflect the desired state.

kubectl get pods -w
NAME                   READY   STATUS    RESTARTS   AGE
app-5586fc9d77-7frts   1/1     Running   0          21m
app-5586fc9d77-mxhgw   1/1     Running   0          21m
app-5586fc9d77-qpxwk   1/1     Running   0          22m
app-5586fc9d77-tpz8q   1/1     Running   0          21m
dnsutils               1/1     Running   3          3h1m
postgres-0             1/1     Running   0          3h
postgres-1             1/1     Running   0          3h
app-5586fc9d77-7frts   1/1     Running   0          22m
app-5586fc9d77-7frts   1/1     Running   0          22m
app-5586fc9d77-6dcg2   0/1     Pending   0          0s
app-5586fc9d77-6dcg2   0/1     Pending   0          0s
app-5586fc9d77-6dcg2   0/1     ContainerCreating   0          0s
app-5586fc9d77-6dcg2   0/1     Running             0          2s
app-5586fc9d77-6dcg2   1/1     Running             0          7s

The Deployment again has four pods matching its selector, but our relabeled pod, app-5586fc9d77-7frts, still exists.

kubectl get pods
NAME                   READY   STATUS    RESTARTS   AGE
app-5586fc9d77-6dcg2   1/1     Running   0          4m51s
app-5586fc9d77-7frts   1/1     Running   0          27m
app-5586fc9d77-mxhgw   1/1     Running   0          27m
app-5586fc9d77-qpxwk   1/1     Running   0          28m
app-5586fc9d77-tpz8q   1/1     Running   0          27m
dnsutils               1/1     Running   3          3h6m
postgres-0             1/1     Running   0          3h6m
postgres-1             1/1     Running   0          3h6m

The pod app-5586fc9d77-6dcg2 is now part of the Deployment and the Endpoints object, with IP address 10.244.1.6.

kubectl get pods app-5586fc9d77-6dcg2 -o wide
NAME                   READY   STATUS    RESTARTS   AGE    IP           NODE
app-5586fc9d77-6dcg2   1/1     Running   0          3m6s   10.244.1.6   kind-worker2

As always, we can see the full picture of details with kubectl describe.

 kubectl describe endpoints clusterip-service
Name:         clusterip-service
Namespace:    default
Labels:       app=app
Annotations:  endpoints.kubernetes.io/last-change-trigger-time: 2021-01-30T19:14:23Z
Subsets:
  Addresses:          10.244.1.6,10.244.2.7,10.244.2.8,10.244.3.9
  NotReadyAddresses:  <none>
  Ports:
    Name     Port  Protocol
    ----     ----  --------
    <unset>  8080  TCP

Events:
  Type     Reason                  Age   From                 Message
  ----     ------                  ----  ----                 -------

For large Deployments, the Endpoints object can become very large, so much so that it can actually slow down changes in the cluster. To solve that issue, the Kubernetes maintainers came up with EndpointSlices.

Endpoint Slices

You may be asking, how are EndpointSlices different from Endpoints? This is where we really start to get into the weeds of Kubernetes networking.

In a typical cluster, Kubernetes runs kube-proxy on every node. kube-proxy is responsible for the per-node portions of making Services work, by handling routing and outbound load balancing to all the pods in a Service. To do that, kube-proxy watches all endpoints in the cluster, so that it knows all applicable pods that all services should route to.

Now, imagine we have a big cluster, with thousands of nodes, and tens of thousands of pods. That means thousands of kube-proxies are watching endpoints. When an address changes in an endpoints object (say, from a rolling update, scale up, eviction, healthcheck failure, or any number of reasons), the updated endpoints object is pushed to all listening kube-proxies. It is made worse by the number of pods, since more pods means larger endpoints objects, and more frequent changes. This eventually becomes a strain on etcd, the Kubernetes apiserver, and the network itself. Kubernetes scaling limits are complex and depend on specific criteria, but endpoints watching is a common problem in clusters that have thousands of nodes. Anecdotally, many Kubernetes users consider endpoints watches to be the ultimate bottleneck of cluster size.

This problem is a function of kube-proxy's design, and the expectation that any pod should immediately be able to route to any service with no notice. EndpointSlices are an approach that allows kube-proxy's fundamental design to continue, while drastically reducing the watch bottleneck in large clusters where large services are used.

EndpointSlices have contents similar to an Endpoints object, but structured as an array of endpoints, each with its own addresses and conditions.

apiVersion: discovery.k8s.io/v1beta1
kind: EndpointSlice
metadata:
  name: demo-slice-1
  labels:
    kubernetes.io/service-name: demo
addressType: IPv4
ports:
  - name: http
    protocol: TCP
    port: 80
endpoints:
  - addresses:
      - "10.0.0.1"
    conditions:
      ready: true

The meaningful difference between Endpoints and EndpointSlices is not the schema, but how Kubernetes treats them. With "regular" Endpoints, a Kubernetes Service creates one Endpoints object for all pods in the Service. With EndpointSlices, a Service creates multiple EndpointSlices, each containing a subset of pods; Figure 5-2 depicts this. The union of all EndpointSlices for a Service contains all pods in the Service. This way, an IP address change (due to a new pod, a deleted pod, or a pod's health changing) results in a much smaller data transfer to watchers. Because Kubernetes doesn't have a transactional API, the same address may appear temporarily in multiple slices. Any code consuming EndpointSlices (such as kube-proxy) must be able to account for this.

EndpointsVSliice
Figure 5-2. Endpoints versus Endpointslice objects

The maximum number of addresses in an EndpointSlice is set using the --max-endpoints-per-slice kube-controller-manager flag. The current default is 100, and the maximum is 1000. The EndpointSlice controller attempts to fill existing EndpointSlices before creating new ones, but does not rebalance them.

The EndpointSlice controller mirrors Endpoints to EndpointSlices, to allow systems to continue writing Endpoints while treating EndpointSlices as the source of truth. The exact future of this behavior, and of Endpoints in general, has not been finalized (however, as a v1 resource, Endpoints would be sunset with substantial notice). There are four exceptions that will prevent mirroring:

  • There is no corresponding Service.

  • The corresponding Service resource selects pods.

  • The Endpoints object has the label endpointslice.kubernetes.io/skip-mirror: true.

  • The Endpoints object has the annotation control-plane.alpha.kubernetes.io/leader.

You can fetch all EndpointSlices for a specific Service by fetching EndpointSlices filtered to the desired Service name in .metadata.labels."kubernetes.io/service-name".
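
For example, to list the EndpointSlices that belong to the clusterip-service Service used throughout this chapter:

kubectl get endpointslices -l kubernetes.io/service-name=clusterip-service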

Warning

EndpointSlices have been in beta since Kubernetes 1.17. This is still the case in Kubernetes 1.20, the current version at the time of writing. Beta resources typically don't see major changes and eventually graduate to stable APIs, but that is not guaranteed. If you directly use EndpointSlices, be aware that a future Kubernetes release may make a breaking change without much warning, or the behaviors described here may change.

Let's see the EndpointSlices running in the cluster now with kubectl get endpointslice.

kubectl get endpointslice
NAME                      ADDRESSTYPE   PORTS   ENDPOINTS
clusterip-service-l2n9q   IPv4          8080    10.244.2.7,10.244.2.8,10.244.1.5 + 1 more...

If we want more detail about the endpointslice clusterip-service-l2n9q we can use kubectl describe on it.

kubectl describe endpointslice clusterip-service-l2n9q
Name:         clusterip-service-l2n9q
Namespace:    default
Labels:       endpointslice.kubernetes.io/managed-by=endpointslice-controller.k8s.io
              kubernetes.io/service-name=clusterip-service
Annotations:  endpoints.kubernetes.io/last-change-trigger-time: 2021-01-30T18:51:36Z
AddressType:  IPv4
Ports:
  Name     Port  Protocol
  ----     ----  --------
  <unset>  8080  TCP
Endpoints:
  - Addresses:  10.244.2.7
    Conditions:
      Ready:    true
    Hostname:   <unset>
    TargetRef:  Pod/app-5586fc9d77-qpxwk
    Topology:   kubernetes.io/hostname=kind-worker
  - Addresses:  10.244.2.8
    Conditions:
      Ready:    true
    Hostname:   <unset>
    TargetRef:  Pod/app-5586fc9d77-tpz8q
    Topology:   kubernetes.io/hostname=kind-worker
  - Addresses:  10.244.1.5
    Conditions:
      Ready:    true
    Hostname:   <unset>
    TargetRef:  Pod/app-5586fc9d77-7frts
    Topology:   kubernetes.io/hostname=kind-worker2
  - Addresses:  10.244.3.9
    Conditions:
      Ready:    true
    Hostname:   <unset>
    TargetRef:  Pod/app-5586fc9d77-mxhgw
    Topology:   kubernetes.io/hostname=kind-worker3
Events:         <none>

In the output, we see the pods powering the EndpointSlice in TargetRef. The Topology information gives us the hostname of the worker node that the pod is deployed to. Most importantly, Addresses returns the IP address of each endpoint.

Endpoints and EndpointSlices are important to understand because they identify the pods responsible for a Service, no matter the type deployed. Later in the chapter, we review how to use endpoints and labels for troubleshooting. Next we will investigate all the Kubernetes Service types.

Kubernetes Services

A Service in Kubernetes is a load balancing abstraction within a cluster. There are four types of Services, specified by the .spec.type field. Each type offers a different form of load balancing or discovery, which we will cover individually. The four types are: ClusterIP, NodePort, LoadBalancer, and ExternalName.

Services use a standard pod selector to match pods. The Service includes all matching pods. Services create an Endpoints (or EndpointSlice) object to handle pod discovery.

apiVersion: v1
kind: Service
metadata:
  name: demo-service
spec:
  selector:
    app: demo

We will use the Golang minimal web server for all of the Service examples. We have added functionality to the application to display the host and pod IP addresses in response to REST requests.

Figure 5-3 outlines our pod networking status as a single pod in a cluster. The networking objects we are about to explore will, in some instances, expose our app pods outside the cluster, and in others allow us to scale our application to meet demand. Recall from Chapters 3 and 4 that containers running inside pods share a network namespace, among others. There is also a pause container created for each pod; the pause container manages the namespaces for the pod.

Note

The pause container is the parent container for all running containers in the pod. It holds and shares all the namespaces for the pod. You can read more about the pause container in Ian Lewis's blog post: https://www.ianlewis.org/en/almighty-pause-container

Pod on Host
Figure 5-3. Pod on Host

Before we deploy the Services, we must first deploy the web server that the Services will be routing traffic to, if we have not already.

 kubectl apply -f web.yaml
deployment.apps/app created

kubectl get pods -o wide
NAME                  READY   STATUS    RESTARTS   AGE   IP           NODE
app-9cc7d9df8-ffsm6   1/1     Running   0          49s   10.244.1.4   kind-worker2
dnsutils              1/1     Running   0          49m   10.244.3.2   kind-worker3
postgres-0            1/1     Running   0          48m   10.244.1.3   kind-worker2
postgres-1            1/1     Running   0          48m   10.244.2.3   kind-worker

Let’s look at each type of service starting with NodePort.

NodePort

A NodePort Service provides a simple way for external software, such as a load balancer, to route traffic to the pods. The software only needs to be aware of node IPs and the Service's port(s). A NodePort Service exposes a fixed port on all nodes, which routes to applicable pods. A NodePort Service uses the .spec.ports[*].nodePort field to specify the port to open on all nodes, for the corresponding port on pods.

apiVersion: v1
kind: Service
metadata:
  name: demo-service
spec:
  type: NodePort
  selector:
    app: demo
  ports:
    - port: 80
      targetPort: 80
      nodePort: 30000

The nodePort field can be left blank, in which case Kubernetes will automatically select a unique port. The --service-node-port-range flag on the kube-apiserver sets the valid range for these ports, 30000-32767 by default. Manually specified ports must be within this range.

Using a NodePort Service, external users can connect to the node port on any node and be routed to a node that has a pod backing that Service; Figure 5-4 demonstrates this. The Service directs traffic to node 3, and iptables rules forward the traffic to node 2, which hosts the pod. This is a bit inefficient, as a typical connection will be routed to a pod on another node.

Node Port Traffic Flow
Figure 5-4. Node Port Traffic Flow

Figure 5-4 requires us to discuss an attribute of Services, externalTrafficPolicy. ExternalTrafficPolicy indicates how a Service will route external traffic to either node-local or cluster-wide endpoints. “Local” preserves the client source IP and avoids a second hop for LoadBalancer and NodePort type services but risks potentially imbalanced traffic spreading. “Cluster” obscures the client source IP and may cause a second hop to another node but should have good overall load-spreading. A “Cluster” value means that for each worker node, the kube-proxy iptable rules are set up to route the traffic to the pods backing the service anywhere in the cluster, just like we have shown in Figure 5-4.

A "Local" value means the kube-proxy iptables rules are set up only on the worker nodes with relevant pods running, to route the traffic local to the worker node. Using Local also allows application developers to preserve the source IP of the user request. If you set externalTrafficPolicy to Local, kube-proxy will only proxy requests to node-local endpoints and will not forward traffic to other nodes. If there are no local endpoints, packets sent to the node are dropped.
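
Setting the policy is a one-line change on the Service spec. Here is a sketch of the earlier demo-service NodePort example with the Local policy applied:

apiVersion: v1
kind: Service
metadata:
  name: demo-service
spec:
  type: NodePort
  externalTrafficPolicy: Local   # only route to pods on the node that received the traffic
  selector:
    app: demo
  ports:
    - port: 80
      targetPort: 80
      nodePort: 30000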

Let us scale up the Deployment of our web app for some more testing.

 kubectl scale deployment app --replicas 4
deployment.apps/app scaled

 kubectl get pods -l app=app -o wide
NAME                  READY   STATUS    RESTARTS   AGE   IP           NODE
app-9cc7d9df8-9d5t8   1/1     Running   0          43s   10.244.2.4   kind-worker
app-9cc7d9df8-ffsm6   1/1     Running   0          75m   10.244.1.4   kind-worker2
app-9cc7d9df8-srxk5   1/1     Running   0          45s   10.244.3.4   kind-worker3
app-9cc7d9df8-zrnvb   1/1     Running   0          43s   10.244.3.5   kind-worker3

With four pods running, we will have a pod on every worker node in the cluster.

 kubectl get pods -o wide -l app=app
NAME                   READY   STATUS    RESTARTS   AGE   IP           NODE
app-5586fc9d77-7frts   1/1     Running   0          31s   10.244.1.5   kind-worker2
app-5586fc9d77-mxhgw   1/1     Running   0          31s   10.244.3.9   kind-worker3
app-5586fc9d77-qpxwk   1/1     Running   0          84s   10.244.2.7   kind-worker
app-5586fc9d77-tpz8q   1/1     Running   0          31s   10.244.2.8   kind-worker

Now let's deploy our NodePort Service.

kubectl apply -f services-nodeport.yaml
service/nodeport-service created

kubectl describe svc nodeport-service
Name:                     nodeport-service
Namespace:                default
Labels:                   <none>
Annotations:              <none>
Selector:                 app=app
Type:                     NodePort
IP:                       10.101.85.57
Port:                     echo  8080/TCP
TargetPort:               8080/TCP
NodePort:                 echo  30040/TCP
Endpoints:                10.244.1.5:8080,10.244.2.7:8080,10.244.2.8:8080 + 1 more...
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>

To test the NodePort Service, we must retrieve the IP address of a worker node.

kubectl get nodes -o wide
NAME                 STATUS   ROLES   INTERNAL-IP OS-IMAGE       KERNEL-VERSION
kind-control-plane   Ready    master  172.18.0.5  Ubuntu 19.10   4.19.121-linuxkit
kind-worker          Ready    <none>  172.18.0.3  Ubuntu 19.10   4.19.121-linuxkit
kind-worker2         Ready    <none>  172.18.0.4  Ubuntu 19.10   4.19.121-linuxkit
kind-worker3         Ready    <none>  172.18.0.2  Ubuntu 19.10   4.19.121-linuxkit

Communication external to the cluster will use the NodePort, 30040, opened on each worker and the worker node's IP address.

We can see that our pods are reachable on each host in the cluster.

kubectl exec -it dnsutils -- wget -q -O-  172.18.0.5:30040/host
NODE: kind-worker2, POD IP:10.244.1.5

kubectl exec -it dnsutils -- wget -q -O-  172.18.0.3:30040/host
NODE: kind-worker, POD IP:10.244.2.8

kubectl exec -it dnsutils -- wget -q -O-  172.18.0.4:30040/host
NODE: kind-worker2, POD IP:10.244.1.5

It's important to consider the limitations as well. A NodePort deployment will fail if it cannot allocate the requested port. Also, ports must be tracked across all applications using a NodePort Service. Using manually selected ports raises the issue of port collisions (especially when applying a workload to multiple clusters, which may not have the exact same node ports free).

Another downside of the NodePort Service type is that the load balancer or client software must be aware of node IP addresses. A static configuration (e.g., an operator manually copying node IP addresses) may become outdated over time (especially on a cloud provider), as IP addresses change or nodes are replaced. A reliable system automatically populates node IP addresses, either by watching which machines have been allocated to the cluster, or by listing nodes from the Kubernetes API itself.

NodePorts are the earliest form of Services. We will see that other Service types use node ports as a base structure in their architecture. NodePorts should not be used by themselves, as clients would need to know the IP addresses of hosts and nodes for connection requests. We will see how node ports are used to enable load balancers later in the chapter and when we discuss cloud networks.

Next up is the default Service type, ClusterIP.

ClusterIP

A pod's IP address shares the lifecycle of the pod and thus is not reliable for clients to use for requests. Services help overcome this pod networking design. A ClusterIP Service provides an internal load balancer with a single IP address that maps to all matching (and ready) pods.

The Service’s IP address must be within the CIDR set in service-cluster-ip-range, in the apiserver. You can specify a valid IP address manually, or leave .spec.clusterIP unset to have one assigned automatically. The ClusterIP address is a virtual IP address that is only routable internally.

kube-proxy is responsible for making the ClusterIP address route to all applicable pods. See the section on kube-proxy for more. In “normal” configurations, kube-proxy performs L4 load balancing, which may not be sufficient. For example, older pods may see more load, due to accumulating more long-lived connections from clients. Or, a few clients making many requests may cause load to be distributed unevenly.

A particular use case example for ClusterIP is when a workload requires a load balancer within the same cluster.

In Figure 5-5, we can see a ClusterIP Service deployed. The Service name is app, with a selector of App=App1. There are two pods powering this Service: Pod 1 and Pod 5 match the selector for the Service.

Cluster IP
Figure 5-5. Cluster IP Example Service

Let us dig into an example on the command line with our kind cluster.

We will deploy a ClusterIP service for use with our Golang webserver.
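
The manifest is in service-clusterip.yaml in the book's code repository; a sketch of its contents, reconstructed to match the selector and ports in the describe output that follows, looks like this:

apiVersion: v1
kind: Service
metadata:
  name: clusterip-service
  labels:
    app: app
spec:
  selector:
    app: app
  ports:
    - port: 80
      targetPort: 8080
      protocol: TCP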

kubectl apply -f service-clusterip.yaml
service/clusterip-service created

kubectl describe svc clusterip-service
Name:              clusterip-service
Namespace:         default
Labels:            app=app
Annotations:       <none>
Selector:          app=app
Type:              ClusterIP
IP:                10.98.252.195
Port:              <unset>  80/TCP
TargetPort:        8080/TCP
Endpoints:         10.244.1.4:8080
Session Affinity:  None
Events:            <none>

The ClusterIP Service name is resolvable in the cluster network.

kubectl exec dnsutils -- host clusterip-service
clusterip-service.default.svc.cluster.local has address 10.98.252.195

Now we can reach the host API endpoint with the cluster IP, 10.98.252.195; the Service name, clusterip-service; or directly with the pod IP 10.244.1.4 and port 8080.

kubectl exec dnsutils -- wget -q -O- clusterip-service/host
NODE: kind-worker2, POD IP:10.244.1.4

kubectl exec dnsutils -- wget -q -O- 10.98.252.195/host
NODE: kind-worker2, POD IP:10.244.1.4

kubectl exec dnsutils -- wget -q -O- 10.244.1.4:8080/host
NODE: kind-worker2, POD IP:10.244.1.4

ClusterIP is the default type for Services. Given that default status, it is worth exploring what the ClusterIP Service abstracted away for us. If you recall from Chapters 2 and 3, this list is similar to what is set up with a Docker network, but we now also have iptables rules for the Service across all nodes.

  • View veth pair and match with pod

  • View network namespace and match with pod

  • Verify pids on node match pods

  • Match services with iptables rules

To explore this, we need to know which worker node the pod is deployed to, and that is kind-worker2.

kubectl get pods -o wide --field-selector spec.nodeName=kind-worker2 -l app=app
NAME                  READY   STATUS    RESTARTS   AGE     IP           NODE
app-9cc7d9df8-ffsm6   1/1     Running   0          7m23s   10.244.1.4   kind-worker2

Since we are using kind, we can use docker ps and docker exec to get information out of the running worker node kind-worker2.

docker ps
CONTAINER ID   COMMAND                   PORTS                       NAMES
df6df0736958   "/usr/local/bin/entr…"                                kind-worker2
e242f11d2d00   "/usr/local/bin/entr…"                                kind-worker
a76b32f37c0e   "/usr/local/bin/entr…"                                kind-worker3
07ccb63d870f   "/usr/local/bin/entr…"    0.0.0.0:80->80/tcp,         kind-control-plane
                                         0.0.0.0:443->443/tcp,
                                         127.0.0.1:52321->6443/tcp

The kind-worker2 container ID is df6df0736958. kind was kind enough to label each container with a name, so we can reference the worker node by its name, kind-worker2.

Let's look at the IP address and route table information of our pod, app-9cc7d9df8-ffsm6.

kubectl exec app-9cc7d9df8-ffsm6 -- ip r
default via 10.244.1.1 dev eth0
10.244.1.0/24 via 10.244.1.1 dev eth0 src 10.244.1.4
10.244.1.1 dev eth0 scope link src 10.244.1.4

Our pod's IP address is 10.244.1.4 on interface eth0, with 10.244.1.1 as its default route. That matches interface 5 on the pod, eth0@if5, in the ip a output below.

kubectl exec app-9cc7d9df8-ffsm6 -- ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
3: ip6tnl0@NONE: <NOARP> mtu 1452 qdisc noop state DOWN group default qlen 1000
    link/tunnel6 :: brd ::
5: eth0@if5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 3e:57:42:6e:cd:45 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.244.1.4/24 brd 10.244.1.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::3c57:42ff:fe6e:cd45/64 scope link
       valid_lft forever preferred_lft forever

Let’s check the network namespace as well, from the node ip a output.

 docker exec -it kind-worker2 ip a
<trimmed>
5: veth45d1f3e8@if5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 3e:39:16:38:3f:23 brd ff:ff:ff:ff:ff:ff link-netns cni-ec37f6e4-a1b5-9bc9-b324-59d612edb4d4
    inet 10.244.1.1/32 brd 10.244.1.1 scope global veth45d1f3e8
       valid_lft forever preferred_lft forever

The ip netns list output confirms the network namespace, cni-ec37f6e4-a1b5-9bc9-b324-59d612edb4d4, that connects our pod's interface to the host interface.

docker exec -it kind-worker2 /usr/sbin/ip netns list
cni-ec37f6e4-a1b5-9bc9-b324-59d612edb4d4 (id: 2)
cni-c18c44cb-6c3e-c48d-b783-e7850d40e01c (id: 1)

Let's see what processes run inside that network namespace. For that, we will use docker exec to run commands inside the node kind-worker2, which hosts the pod and its network namespace.

 docker exec -it kind-worker2 /usr/sbin/ip netns pid cni-ec37f6e4-a1b5-9bc9-b324-59d612edb4d4
4687
4737

Now we can grep for each process ID and inspect what those processes are doing.

docker exec -it kind-worker2 ps aux | grep 4687
root      4687  0.0  0.0    968     4 ?        Ss   17:00   0:00 /pause

docker exec -it kind-worker2 ps aux | grep 4737
root      4737  0.0  0.0 708376  6368 ?        Ssl  17:00   0:00 /opt/web-server

4737 is the process ID of our web server container running on kind-worker2.

4687 is our pause container, holding onto all our namespaces.

Now let's look at the iptables rules on the worker node.

docker exec -it kind-worker2 iptables -L
Chain INPUT (policy ACCEPT)
target                  prot opt source     destination
/* kubernetes service portals */
KUBE-SERVICES           all  --  anywhere   anywhere    ctstate NEW
/* kubernetes externally-visible service portals */
KUBE-EXTERNAL-SERVICES  all  --  anywhere   anywhere    ctstate NEW
KUBE-FIREWALL           all  --  anywhere   anywhere

Chain FORWARD (policy ACCEPT)
target        prot opt source     destination
/* kubernetes forwarding rules */
KUBE-FORWARD  all  --  anywhere   anywhere
/* kubernetes service portals */
KUBE-SERVICES all  --  anywhere   anywhere             ctstate NEW

Chain OUTPUT (policy ACCEPT)
target          prot opt source               destination
/* kubernetes service portals */
KUBE-SERVICES   all  --  anywhere             anywhere             ctstate NEW
KUBE-FIREWALL   all  --  anywhere             anywhere

Chain KUBE-EXTERNAL-SERVICES (1 references)
target     prot opt source               destination

Chain KUBE-FIREWALL (2 references)
target     prot opt source    destination
/* kubernetes firewall for dropping marked packets */
DROP       all  --  anywhere  anywhere   mark match 0x8000/0x8000

Chain KUBE-FORWARD (1 references)
target  prot opt source    destination
DROP    all  --  anywhere  anywhere    ctstate INVALID
/*kubernetes forwarding rules*/
ACCEPT  all  --  anywhere  anywhere     mark match 0x4000/0x4000
/*kubernetes forwarding conntrack pod source rule*/
ACCEPT  all  --  anywhere  anywhere     ctstate RELATED,ESTABLISHED
/*kubernetes forwarding conntrack pod destination rule*/
ACCEPT  all  --  anywhere  anywhere     ctstate RELATED,ESTABLISHED

Chain KUBE-KUBELET-CANARY (0 references)
target     prot opt source               destination

Chain KUBE-PROXY-CANARY (0 references)
target     prot opt source               destination

Chain KUBE-SERVICES (3 references)
target     prot opt source               destination

That is a lot of chains and rules being managed by Kubernetes.

We can dive a little deeper to examine the iptables rules responsible for the Service we deployed. Let us retrieve the IP address of the deployed clusterip-service; we need it to find the matching iptables rules.

kubectl get svc clusterip-service
NAME                TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
clusterip-service   ClusterIP   10.98.252.195    <none>        80/TCP     57m

Now use the cluster IP of the service, 10.98.252.195, to find our iptables rules.

docker exec -it  kind-worker2 iptables -L -t nat | grep 10.98.252.195
/* default/clusterip-service: cluster IP */
KUBE-MARK-MASQ  tcp  -- !10.244.0.0/16        10.98.252.195 tcp dpt:80
/* default/clusterip-service: cluster IP */
KUBE-SVC-V7R3EVKW3DT43QQM  tcp  --  anywhere  10.98.252.195 tcp dpt:80

List all the rules on the chain KUBE-SVC-V7R3EVKW3DT43QQM.

docker exec -it  kind-worker2 iptables -t nat -L KUBE-SVC-V7R3EVKW3DT43QQM
Chain KUBE-SVC-V7R3EVKW3DT43QQM (1 references)
target     prot opt source               destination
/* default/clusterip-service: */
KUBE-SEP-THJR2P3Q4C2QAEPT  all  --  anywhere             anywhere

The KUBE-SEP-* chains contain the endpoints for the Service; here, it is KUBE-SEP-THJR2P3Q4C2QAEPT.

Now we can see what the rules for this chain are in iptables.

docker exec -it kind-worker2 iptables -L KUBE-SEP-THJR2P3Q4C2QAEPT -t nat
Chain KUBE-SEP-THJR2P3Q4C2QAEPT (1 references)
target          prot opt source          destination
/* default/clusterip-service: */
KUBE-MARK-MASQ  all  --  10.244.1.4      anywhere
/* default/clusterip-service: */
DNAT            tcp  --  anywhere        anywhere    tcp to:10.244.1.4:8080

10.244.1.4:8080 is one of the Service's endpoints, aka a pod backing the Service, which is confirmed by the output of kubectl get ep clusterip-service.

kubectl get ep clusterip-service
NAME                ENDPOINTS                         AGE
clusterip-service   10.244.1.4:8080                   62m

kubectl describe ep clusterip-service
Name:         clusterip-service
Namespace:    default
Labels:       app=app
Annotations:  <none>
Subsets:
  Addresses:          10.244.1.4
  NotReadyAddresses:  <none>
  Ports:
    Name     Port  Protocol
    ----     ----  --------
    <unset>  8080  TCP

Events:  <none>

Now, let's explore the limitations of ClusterIP. The ClusterIP is for traffic internal to the cluster, and it suffers the same issues as Endpoints does: as the Service size grows, updates to it will slow. In Chapter 2 we discussed how to mitigate that by using IPVS instead of iptables as the proxy mode for kube-proxy. We will discuss later in this chapter how to get traffic into the cluster using Ingress and the LoadBalancer Service type.
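
If you want to confirm which proxy mode your cluster is running, one option (assuming a kubeadm-based cluster such as kind, where kube-proxy is configured through a ConfigMap; the location may differ in your environment) is:

kubectl -n kube-system get configmap kube-proxy -o yaml | grep mode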

ClusterIP is the default type of Service, but there are several other specific types of Services: Headless and ExternalName. ExternalName is a specific type of Service that helps with reaching services outside the cluster. We briefly touched on headless Services with StatefulSets, but let's review them in depth now.

Headless

A headless Service isn't a formal type of Service (i.e., there is no .spec.type: Headless). A headless Service is a Service with .spec.clusterIP: "None". This is distinct from merely not setting a cluster IP, which makes Kubernetes automatically assign one.

When ClusterIP is set to “None”, the Service does not support any load balancing functionality. Instead, it only provisions an Endpoints object, and points the service DNS record at all pods that are selected and ready.

A headless Service provides a generic way to watch endpoints, without needing to interact with the Kubernetes API. Fetching DNS records is much simpler than integrating with the Kubernetes API, and integrating with the API may not even be possible with third-party software.

Headless Services allow developers to deploy multiple copies of a pod in a Deployment. Instead of a single IP address being returned, as with a ClusterIP Service, all the IP addresses of the endpoints are returned in the query. It is then up to the client to pick which one to use. To see this in action, let us scale up the Deployment of our web app.

 kubectl scale deployment app --replicas 4
deployment.apps/app scaled

 kubectl get pods -l app=app -o wide
NAME                  READY   STATUS    RESTARTS   AGE   IP           NODE
app-9cc7d9df8-9d5t8   1/1     Running   0          43s   10.244.2.4   kind-worker
app-9cc7d9df8-ffsm6   1/1     Running   0          75m   10.244.1.4   kind-worker2
app-9cc7d9df8-srxk5   1/1     Running   0          45s   10.244.3.4   kind-worker3
app-9cc7d9df8-zrnvb   1/1     Running   0          43s   10.244.3.5   kind-worker3

Now let us deploy the headless service.

kubectl apply -f service-headless.yml
service/headless-service created
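
The manifest in service-headless.yml sets the cluster IP to None; a minimal sketch, assuming it reuses the same app=app selector and port 8080 as our other Services, looks like this:

apiVersion: v1
kind: Service
metadata:
  name: headless-service
spec:
  clusterIP: None
  selector:
    app: app
  ports:
    - port: 8080
      targetPort: 8080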

The DNS query will return all four of the Pod IP addresses. Using our dnsutils image we can verify that is the case.

kubectl exec dnsutils -- host -v -t a headless-service
Trying "headless-service.default.svc.cluster.local"
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 45294
;; flags: qr aa rd; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;headless-service.default.svc.cluster.local. IN A

;; ANSWER SECTION:
headless-service.default.svc.cluster.local. 30 IN A 10.244.2.4
headless-service.default.svc.cluster.local. 30 IN A 10.244.3.5
headless-service.default.svc.cluster.local. 30 IN A 10.244.1.4
headless-service.default.svc.cluster.local. 30 IN A 10.244.3.4

Received 292 bytes from 10.96.0.10#53 in 0 ms

The IP addresses returned from the query also match the endpoints for the Service; kubectl describe on the Endpoints object confirms that.

 kubectl describe endpoints headless-service
Name:         headless-service
Namespace:    default
Labels:       service.kubernetes.io/headless
Annotations:  endpoints.kubernetes.io/last-change-trigger-time: 2021-01-30T18:16:09Z
Subsets:
  Addresses:          10.244.1.4,10.244.2.4,10.244.3.4,10.244.3.5
  NotReadyAddresses:  <none>
  Ports:
    Name     Port  Protocol
    ----     ----  --------
    <unset>  8080  TCP

Events:  <none>

Headless has a very specific use case and is not typically used for deployments. As we mentioned in the StatefulSets section, if developers need to let the client decide which endpoint to use, headless is the appropriate type of Service to deploy. Two examples of headless Services are clustered databases and applications that have client-side load-balancing logic built into the code.

Our next example is ExternalName, which aids in migrating services that are external to the cluster. It also offers other DNS advantages inside cluster DNS.

ExternalName Service

ExternalName is a special case of Service that does not have selectors and uses DNS names instead.

When looking up the host ext-service.default.svc.cluster.local, the cluster DNS service returns a CNAME record of database.mycompany.com.

apiVersion: v1
kind: Service
metadata:
  name: ext-service
spec:
  type: ExternalName
  externalName: database.mycompany.com

If developers are migrating an application into Kubernetes but its dependencies are staying external to the cluster, an ExternalName Service allows them to define a DNS record internal to the cluster, no matter where the service actually runs.

DNS will try all of the search domains, as seen in the example below.

 kubectl exec -it dnsutils -- host -v -t a github.com
Trying "github.com.default.svc.cluster.local"
Trying "github.com.svc.cluster.local"
Trying "github.com.cluster.local"
Trying "github.com"
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 55908
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;github.com.                    IN      A

;; ANSWER SECTION:
github.com.             30      IN      A       140.82.112.3

Received 54 bytes from 10.96.0.10#53 in 18 ms

In short, an ExternalName Service allows developers to map a Service to an external DNS name.

Now let's deploy the ExternalName Service.

kubectl apply -f service-external.yml
service/external-service created

The A record for github.com is returned from the external-service query.

kubectl exec -it dnsutils -- host -v -t a external-service
Trying "external-service.default.svc.cluster.local"
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 11252
;; flags: qr aa rd; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;external-service.default.svc.cluster.local. IN A

;; ANSWER SECTION:
external-service.default.svc.cluster.local. 24 IN CNAME github.com.
github.com.             24      IN      A       140.82.112.3

Received 152 bytes from 10.96.0.10#53 in 0 ms

The CNAME query for external-service returns github.com.

kubectl exec -it dnsutils -- host -v -t cname external-service
Trying "external-service.default.svc.cluster.local"
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 36874
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;external-service.default.svc.cluster.local. IN CNAME

;; ANSWER SECTION:
external-service.default.svc.cluster.local. 30 IN CNAME github.com.

Received 126 bytes from 10.96.0.10#53 in 0 ms

Sending traffic to a headless Service via its DNS record is possible, but inadvisable. DNS is a notoriously poor way to load balance, as software takes very different (and often simple or unintuitive) approaches to A or AAAA DNS records that return multiple IP addresses. For example, it is common for software to always choose the first IP address in the response, and/or to cache and reuse the same IP address indefinitely. If you need to be able to send traffic to the Service's DNS address, consider a (standard) ClusterIP or LoadBalancer Service.

The "correct" way to use a headless Service is to query the Service's A/AAAA DNS record and use that data in a server-side or client-side load balancer.
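
As a sketch of that client-side approach, the following Go snippet resolves the headless-service example from earlier and picks a backend itself; the random selection and hardcoded port are illustrative assumptions, not a production load-balancing strategy.

package main

import (
    "fmt"
    "math/rand"
    "net"
)

func main() {
    // Each ready pod behind the headless Service appears as one A record.
    addrs, err := net.LookupHost("headless-service.default.svc.cluster.local")
    if err != nil {
        panic(err)
    }
    // Pick a backend at random; a real client might round-robin,
    // health-check, or pool connections instead.
    backend := addrs[rand.Intn(len(addrs))]
    fmt.Printf("sending request to %s:8080\n", backend)
}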

Most of the Services we have been discussing manage traffic internal to the cluster network. In the next sections, we will review how to route requests into the cluster with the LoadBalancer Service type and Ingresses.

LoadBalancer

LoadBalancer Services expose services external to the cluster network. They combine NodePort Service behavior with an external integration, such as a cloud provider’s load balancer. Notably, LoadBalancer services handle L4 traffic (unlike Ingress, which handles L7 traffic), so they will work for any TCP or UDP service, provided the load balancer selected supports L4 traffic.

Configuration and load balancer options are extremely dependent on the cloud provider. For example, some will support .spec.loadBalancerIP (with varying setup required), and some will ignore it.

apiVersion: v1
kind: Service
metadata:
  name: demo-service
spec:
  selector:
    app: demo
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
  clusterIP: 10.0.5.1
  type: LoadBalancer

Once the load balancer has been provisioned, its IP address will be written to .status.loadBalancer.ingress.ip.

LoadBalancer Services are useful for exposing TCP or UDP services to the outside world. Traffic will come into the load balancer on its public IP address and TCP port 80, defined by spec.ports[*].port, be routed to the cluster IP address, 10.0.5.1, and then to the container target port 8080, spec.ports[*].targetPort. Not shown in the example is .spec.ports[*].nodePort; if not specified, Kubernetes will pick one for the Service.

Tip

The Service's spec.ports[*].targetPort must match your pod's container spec.containers[*].ports[*].containerPort, along with the protocol. Getting this wrong is like missing a semicolon in Kubernetes networking.
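
As a sketch, pairing the demo-service definition above with a matching container port in the pod template looks like this (only the relevant fragments are shown):

# Service spec fragment
ports:
  - protocol: TCP
    port: 80
    targetPort: 8080

# Pod template container spec fragment
ports:
  - containerPort: 8080
    protocol: TCP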

In Figure 5-6 we can see how the LoadBalancer Service builds on the other Service types. The cloud load balancer will determine how to distribute traffic; we will discuss that in depth in the next chapter.

LoadBalancer Service
Figure 5-6. Load Balancer Service

Let us continue to extend our golang web server example with a LoadBalancer Service.

Since we are running on our local machine and not in a service provider like AWS, GCP, or Azure, we can use MetalLB as an example for our LoadBalancer Service. The MetalLB project aims to let users deploy bare-metal load balancers for their clusters.

This example has been modified from the Kind example deployment at https://kind.sigs.k8s.io/docs/user/loadbalancer.

Our first step is to deploy a separate namespace for MetalLB.

kubectl apply -f mlb-ns.yaml
namespace/metallb-system created

MetalLB members also require a secret for joining the load balancer cluster; let us create one now for them to use in our cluster.

kubectl create secret generic -n metallb-system memberlist --from-literal=secretkey="$(openssl rand -base64 128)"
secret/memberlist created

Now we can deploy MetalLB!

 kubectl apply -f ./metallb.yaml
podsecuritypolicy.policy/controller created
podsecuritypolicy.policy/speaker created
serviceaccount/controller created
serviceaccount/speaker created
clusterrole.rbac.authorization.k8s.io/metallb-system:controller created
clusterrole.rbac.authorization.k8s.io/metallb-system:speaker created
role.rbac.authorization.k8s.io/config-watcher created
role.rbac.authorization.k8s.io/pod-lister created
clusterrolebinding.rbac.authorization.k8s.io/metallb-system:controller created
clusterrolebinding.rbac.authorization.k8s.io/metallb-system:speaker created
rolebinding.rbac.authorization.k8s.io/config-watcher created
rolebinding.rbac.authorization.k8s.io/pod-lister created
daemonset.apps/speaker created
deployment.apps/controller created

As you can see, it deploys many objects, and now we wait for the deployment to finish. We can monitor the deployment of resources with the --watch option in the metallb-system namespace.

kubectl get pods -n metallb-system --watch
NAME                          READY   STATUS              RESTARTS   AGE
controller-5df88bd85d-mvgqn   0/1     ContainerCreating   0          10s
speaker-5knqb                 1/1     Running             0          10s
speaker-k79c9                 1/1     Running             0          10s
speaker-pfs2p                 1/1     Running             0          10s
speaker-sl7fd                 1/1     Running             0          10s
controller-5df88bd85d-mvgqn   1/1     Running             0          12s

To complete the configuration, we need to provide MetalLB a range of IP addresses that it controls. This range has to be on the Docker kind network.

docker network inspect -f '{{.IPAM.Config}}' kind
[{172.18.0.0/16  172.18.0.1 map[]} {fc00:f853:ccd:e793::/64  fc00:f853:ccd:e793::1 map[]}]

172.18.0.0/16 is our Docker network running locally.

We want our load balancer IP range to come from this subnet. We can configure MetalLB, for instance, to use 172.18.255.200 to 172.18.255.250 by creating a ConfigMap.

The config map would look like this:

apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 172.18.255.200-172.18.255.250

Let us deploy it so we can use MetalLB.

kubectl apply -f ./metallb-configmap.yaml

Now that we have MetalLB deployed, we can deploy a LoadBalancer Service for our web app.

kubectl apply -f services-loadbalancer.yaml
service/loadbalancer-service created

For fun, let us scale the web app Deployment to 10 replicas, if you have the resources for it!

kubectl scale deployment app --replicas 10

 kubectl get pods -o wide
NAME                    READY   STATUS    RESTARTS   AGE   IP            NODE           NOMINATED NODE   READINESS GATES
app-7bdb9ffd6c-b5x7m    2/2     Running   0          26s   10.244.3.15   kind-worker    <none>           <none>
app-7bdb9ffd6c-bqtf8    2/2     Running   0          26s   10.244.2.13   kind-worker2   <none>           <none>
app-7bdb9ffd6c-fb9sf    2/2     Running   0          26s   10.244.3.14   kind-worker    <none>           <none>
app-7bdb9ffd6c-hrt7b    2/2     Running   0          26s   10.244.2.7   kind-worker2    <none>           <none>
app-7bdb9ffd6c-l2794    2/2     Running   0          26s   10.244.2.9   kind-worker2    <none>           <none>
app-7bdb9ffd6c-l4cfx    2/2     Running   0          26s   10.244.3.11   kind-worker2   <none>           <none>
app-7bdb9ffd6c-rr4kn    2/2     Running   0          23m   10.244.3.10   kind-worker    <none>           <none>
app-7bdb9ffd6c-s4k92    2/2     Running   0          26s   10.244.3.13   kind-worker    <none>           <none>
app-7bdb9ffd6c-shmdt    2/2     Running   0          26s   10.244.1.12   kind-worker3   <none>           <none>
app-7bdb9ffd6c-v87f9    2/2     Running   0          26s   10.244.1.11   kind-worker3   <none>           <none>
app2-658bcd97bd-4n888   1/1     Running   0          35m   10.244.2.6    kind-worker3   <none>           <none>
app2-658bcd97bd-mnpkp   1/1     Running   0          35m   10.244.3.7    kind-worker    <none>           <none>
app2-658bcd97bd-w2qkl   1/1     Running   0          35m   10.244.3.8    kind-worker    <none>           <none>
dnsutils                1/1     Running   1          75m   10.244.1.2    kind-worker3   <none>           <none>
postgres-0              1/1     Running   0          75m   10.244.1.4    kind-worker3   <none>           <none>
postgres-1              1/1     Running   0          75m   10.244.3.4    kind-worker    <none>           <none>

Now we can test the provisioned load balancer.

With more replicas deployed for our app behind the load balancer, we need the external IP of the load balancer, 172.18.255.200.

kubectl get svc loadbalancer-service
NAME                   TYPE           CLUSTER-IP     EXTERNAL-IP      PORT(S)        AGE
loadbalancer-service   LoadBalancer   10.99.24.220   172.18.255.200   80:31276/TCP   52s


kubectl get svc/loadbalancer-service -o=jsonpath='{.status.loadBalancer.ingress[0].ip}'
172.18.255.200

Since Docker for Mac or Windows does not expose the kind network to the host, we cannot directly reach the 172.18.255.200 load balancer IP on the Docker private network.

As a workaround, we can simulate it by attaching a Docker container to the kind network and curling the load balancer.

Tip

If you would like to read more about this issue, there is a great blog post at https://www.thehumblelab.com/kind-and-metallb-on-mac/.

We will use another great networking Docker image, nicolaka/netshoot, to run locally, attach to the kind Docker network, and send requests to our MetalLB load balancer.

If we run it several times, we can see the load balancer doing its job of routing traffic to different pods.

docker run --network kind -a stdin -a stdout -i -t nicolaka/netshoot curl 172.18.255.200/host
NODE: kind-worker, POD IP:10.244.2.7

docker run --network kind -a stdin -a stdout -i -t nicolaka/netshoot curl 172.18.255.200/host
NODE: kind-worker, POD IP:10.244.2.9

docker run --network kind -a stdin -a stdout -i -t nicolaka/netshoot curl 172.18.255.200/host
NODE: kind-worker3, POD IP:10.244.3.11

docker run --network kind -a stdin -a stdout -i -t nicolaka/netshoot curl 172.18.255.200/host
NODE: kind-worker2, POD IP:10.244.1.6

docker run --network kind -a stdin -a stdout -i -t nicolaka/netshoot curl 172.18.255.200/host
NODE: kind-worker, POD IP:10.244.2.9

With each new request, the MetalLB load balancer sends traffic to different pods. Like other Services, the LoadBalancer Service uses selectors and labels to find pods, as we can see in the kubectl describe endpoints loadbalancer-service output. The pod IP addresses match our results from the curl commands.

 kubectl describe endpoints loadbalancer-service
Name:         loadbalancer-service
Namespace:    default
Labels:       app=app
Annotations:  endpoints.kubernetes.io/last-change-trigger-time: 2021-01-30T19:59:57Z
Subsets:
  Addresses:
  10.244.1.6,
  10.244.1.7,
  10.244.1.8,
  10.244.2.10,
  10.244.2.7,
  10.244.2.8,
  10.244.2.9,
  10.244.3.11,
  10.244.3.12,
  10.244.3.9
  NotReadyAddresses:  <none>
  Ports:
    Name          Port  Protocol
    ----          ----  --------
    service-port  8080  TCP

Events:  <none>

It is important to remember that LoadBalancer Services require specific integrations, and will not work without cloud provider support, or manually-installed software such as MetalLB.
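For reference, MetalLB in layer 2 mode assigns addresses from a pool you configure. If you followed along, you applied something like the following when installing MetalLB; this is only a sketch of the older ConfigMap-based configuration (newer MetalLB releases use CRDs instead), and the address range is an assumption that matches the 172.18.255.200 address we saw above.

apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    # Layer 2 address pool MetalLB hands out to LoadBalancer Services.
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 172.18.255.200-172.18.255.250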

They are not (normally) L7 load balancers, and therefore cannot intelligently handle HTTP(S) requests. There is a one-to-one mapping of load balancer to workload, which means that all requests sent to that load balancer must be handled by the same workload.

Tip

While not a network service, it is important to mention the Horizontal Pod Autoscaler, which scales the pods in a replication controller, Deployment, ReplicaSet, or StatefulSet based on CPU utilization.
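As a quick sketch, you could autoscale the app Deployment from our examples with a single command; the CPU target and replica bounds below are illustrative values, not recommendations.

# Keep between 3 and 10 replicas, targeting 50% average CPU utilization.
kubectl autoscale deployment app --cpu-percent=50 --min=3 --max=10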

We can scale our application to meet user demand with no configuration changes on anyone's part. Kubernetes and the LoadBalancer Service take care of all of that for developers and for system and network administrators.

We will see in the next chapter how we can take that even further using Cloud Services for autoscaling.

Services Conclusion

Here are some troubleshooting tips if issues arise with the endpoints or services.

Removing the label on a pod allows it to keep running while the Endpoints object and Service are updated. The endpoints controller removes the unlabeled pod from the Endpoints object, and the Deployment deploys another pod to replace it. This lets you troubleshoot issues with that specific unlabeled pod without adversely affecting the service for end customers. We have used this technique countless times during development, and we did so in the previous sections' examples.

  • There are two probes, readiness and liveness, that communicate the pod's health to the kubelet and the rest of the Kubernetes environment; make sure they are configured correctly.

  • It is also very easy to make a mistake in a YAML manifest; compare the ports on the Service and the pods and make sure they match.

  • We discussed network policies in Chapter 3; those can also stop pods from communicating with each other and with Services. If your cluster network uses network policies, ensure that they are set up appropriately for application traffic flow.

  • Also remember that diagnostic tools such as the dnsutils and netshoot pods running on the cluster network are helpful for debugging.

  • If endpoints are taking too long to come up in the cluster, several options can be configured on the kubelet and kube-proxy to control how quickly they respond to changes in the Kubernetes environment (a brief sketch follows this list):

    • --kube-api-qps: the queries-per-second rate the kubelet uses when communicating with the Kubernetes API server; the default is 5.

    • --kube-api-burst: temporarily allows API queries to burst to this number; the default is 10.

    • --iptables-sync-period: the maximum interval at which iptables rules are refreshed (e.g., 5s, 1m, 2h22m). It must be greater than 0; the default is 30s.

    • --ipvs-sync-period: the maximum interval at which IPVS rules are refreshed. It must be greater than 0; the default is 30s.

  • Increasing these values is recommended for larger clusters, but remember that doing so also increases load on both the components involved and the API server.
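As a rough sketch, these knobs are set as startup flags (or the matching config-file fields) on the components that own them; the values below are illustrative only, not recommendations. Note that the QPS and burst flags belong to the kubelet, while the sync-period flags belong to kube-proxy.

# Added to the kubelet's startup flags (illustrative values):
--kube-api-qps=50 --kube-api-burst=100

# Added to kube-proxy's startup flags (illustrative values):
--iptables-sync-period=15s --ipvs-sync-period=15s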

These settings can help alleviate issues and are good to be aware of as the number of Services and pods grows in the cluster.

The various types of Services exemplify how powerful the network abstractions in Kubernetes are. We have dug deep into how they work at each layer of the toolchain. Developers looking to deploy applications to Kubernetes now have the knowledge to pick which Services are right for their use case. No longer will a network administrator have to manually update load balancers with IP addresses; Kubernetes manages that for them.

We have just scratched the surface of what is possible with Services. With each new version of Kubernetes there are new options to tune and new ways to configure and run Services. Test each Service type for your use cases and ensure you are using the appropriate one to optimize your applications on the Kubernetes network.

The LoadBalancer Service type brings traffic into the cluster by exposing workloads behind an external load balancer, but it operates at L4. Ingresses expose HTTP(S) services to external users and support path-based routing, which allows different HTTP paths to be served by different Services. In the next section we will discuss Ingress and how it provides an alternative way to manage connectivity into cluster resources.

Ingress

An Ingress is a Kubernetes-specific, L7 (HTTP) load balancer, which is accessible externally, in contrast to the L4 ClusterIP Service, which is internal to the cluster. This is the typical choice for exposing an HTTP(S) workload to external users. An Ingress can be a single entry point into an API or a microservice-based architecture. Traffic can be routed to Services based on HTTP information in the request. Ingress is a configuration spec (with multiple implementations) for routing HTTP traffic to Kubernetes Services. Figure 5-7 outlines the Ingress components.

Ingress
Figure 5-7. Ingress Architecture

In order to manage traffic in a cluster with Ingress, two components are required: the controller and the rules. The controller manages ingress pods and the deployed rules, while the rules define how the traffic is routed.

Ingress Controllers and Rules

We call ingress implementations ingress controllers. In Kubernetes, a controller is software that is responsible for managing a resource type and making reality match the desired state.

There are two general kinds of controllers: external load balancer controllers, and internal load balancer controllers. External load balancer controllers create a load balancer that exists “outside” the cluster, such as a cloud provider product. Internal load balancer controllers deploy a load balancer that runs within the cluster, and do not directly solve the problem of routing consumers to the load balancer. There are a myriad of ways that cluster administrators run internal load balancers, such as running the load balancer on a subset of special nodes, and routing traffic somehow to those nodes. The primary motivation for choosing an internal load balancer is cost reduction. An internal load balancer for ingress can route traffic for multiple ingress objects, whereas an external load balancer controller typically needs one load balancer per ingress. As most cloud providers charge by load balancer, it is cheaper to support a single cloud load balancer that does fan-out within the cluster, rather than many cloud load balancers. Note that this incurs operational overhead, and increased latency and compute costs, so be sure the money you’re saving is worth it. Many companies have a bad habit of optimizing on inconsequential cloud spend line items.

Let's look at the Ingress spec. Like LoadBalancer Services, most of the spec is universal, but various ingress controllers have different features and accept different unique config. We'll start with the basics.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: basic-ingress
spec:
  rules:
  - http:
      paths:
      # Send all /demo requests to demo-service.
      - path: /demo
        pathType: Prefix
        backend:
          service:
            name: demo-service
            port:
              number: 80
  # Send all other requests to main-service.
  defaultBackend:
    service:
      name: main-service
      port:
        number: 80

The above example is representative of a typical Ingress. It sends traffic for /demo to one service, and all other traffic to another. Ingresses have a "default backend," where requests are routed if no rule matches. Many ingress controllers let you configure this in the controller configuration itself (e.g., a generic 404 page), and many support the .spec.defaultBackend field. Ingresses support multiple ways to specify a path; there are currently three.

Exact

Matches the specific path and only the given path (including trailing / or lack thereof).

Prefix

Matches all paths that start with the given path.

ImplementationSpecific

Allows for custom semantics defined by the ingress controller in use.

When a request matches multiple paths, the most specific match is chosen. For example, if there are rules for /first and /first/second, any request starting with /first/second will go to the backend for /first/second. If a path matches an exact path and a prefix path, the request will go to the backend for the exact rule.
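Here is a minimal sketch of those precedence rules, using hypothetical first-service and second-service backends.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: precedence-example
spec:
  rules:
  - http:
      paths:
      # /first/second is the more specific match, so it wins for anything under it.
      - path: /first/second
        pathType: Prefix
        backend:
          service:
            name: second-service
            port:
              number: 80
      # All other requests under /first go here.
      - path: /first
        pathType: Prefix
        backend:
          service:
            name: first-service
            port:
              number: 80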

Ingresses can also use hostnames in rules.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: multi-host-ingress
spec:
  rules:
  - host: a.example.com
    http:
      paths:
      - pathType: Prefix
        path: "/"
        backend:
          service:
            name: service-a
            port:
              number: 80
  - host: b.example.com
    http:
      paths:
      - pathType: Prefix
        path: "/"
        backend:
          service:
            name: service-b
            port:
              number: 80

In this example, we serve traffic to a.example.com from one service and traffic to b.example.com from another. This is comparable to virtual hosts in web servers. You may want to use host rules to serve multiple unique domains from a single load balancer and IP.
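To test host rules without real DNS records, you can set the Host header by hand. The commands below assume the ingress controller is reachable on localhost, as in our kind setup; the hostnames are the ones from the example above.

# Requests are routed by the Host header, not by the URL.
curl -H "Host: a.example.com" http://localhost/
curl -H "Host: b.example.com" http://localhost/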

Ingresses have basic TLS support.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo-ingress-secure
spec:
  tls:
  - hosts:
      - https-example.com
    secretName: demo-tls
  rules:
  - host: https-example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: demo-service
            port:
              number: 80

The TLS config references a Kubernetes Secret by name, in .spec.tls.[*].secretName. Ingress controllers expect the TLS certificate and key to be provided in .data."tls.crt" and .data."tls.key" respectively, as shown below.

apiVersion: v1
kind: Secret
metadata:
  name: demo-tls
type: kubernetes.io/tls
data:
  tls.crt: cert, encoded in base64
  tls.key: key, encoded in base64
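Rather than base64 encoding the certificate and key by hand, you can have kubectl build the Secret for you; the file paths below are placeholders for your own certificate and key.

kubectl create secret tls demo-tls \
  --cert=path/to/tls.crt \
  --key=path/to/tls.key
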
Tip

If you don’t need to manage traditionally issued certificates by hand, you can use cert-manager to automatically fetch and update certs. Read more at https://cert-manager.io.

We mentioned earlier that ingress is simply a spec, and drastically different implementations exist. It’s possible to use multiple ingress controllers in a single cluster, using IngressClass settings. An ingress class represents an ingress controller, and therefore a specific ingress implementation.

Warning

Annotations in Kubernetes must be strings. Because true and false have distinct non-string meanings, you cannot set an annotation to true or false without quotes. “true” and “false” are both valid. This is a long-running bug, which is often encountered when setting a default priority class. https://github.com/kubernetes/kubernetes/issues/59113

IngressClass was introduced in Kubernetes 1.18. Prior to 1.18, annotating ingresses with kubernetes.io/ingress.class was a common convention, but relied on all installed ingress controllers to support it. Ingresses can pick an ingress class by setting the class’s name in .spec.ingressClassName.
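A short sketch of what that looks like; the class name below is arbitrary, and the controller string assumes the community NGINX ingress controller, which registers itself as k8s.io/ingress-nginx.

apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: nginx
spec:
  # Which controller implementation handles Ingresses of this class.
  controller: k8s.io/ingress-nginx
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: classed-ingress
spec:
  # Select the class by name rather than by annotation.
  ingressClassName: nginx
  defaultBackend:
    service:
      name: demo-service
      port:
        number: 80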

Warning

If more than one IngressClass is set as default, Kubernetes will not allow you to create an ingress with no ingressclass, or remove the ingressclass from an existing ingress. You can use admission control to prevent multiple IngressClasses from being marked as default.

Ingress only supports HTTP(S) requests, which is insufficient if your service uses a different protocol (e.g., most databases use their own protocols). Some ingress controllers, such as the NGINX ingress controller, do support TCP and UDP, but this is not the norm.
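For example, the NGINX ingress controller exposes raw TCP services through a ConfigMap that maps an external port to a namespace/service:port pair, and the controller must be started with --tcp-services-configmap pointing at it. The Postgres mapping below is only an illustration, assuming a postgres Service in the default namespace.

apiVersion: v1
kind: ConfigMap
metadata:
  name: tcp-services
  namespace: ingress-nginx
data:
  # Listen on port 5432 and forward to the postgres Service's port 5432.
  "5432": "default/postgres:5432"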

Now let's deploy an ingress controller so we can add ingress rules to our Golang web server example.

When we deployed our kind cluster, we had to add several options to allow us to deploy an ingress controller (a sketch of that configuration follows this list):

  • extraPortMappings allow the local host to make requests to the Ingress controller over ports 80/443

  • node-labels restrict the ingress controller to run only on the node(s) matching the label selector
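Putting those together, the kind configuration we used looks roughly like the following sketch, adapted from the kind ingress documentation; the ingress-ready=true label is the one the kind-specific NGINX manifests select on, and the worker nodes are omitted for brevity.

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  kubeadmConfigPatches:
  - |
    kind: InitConfiguration
    nodeRegistration:
      kubeletExtraArgs:
        # Label the node so the ingress controller can be scheduled onto it.
        node-labels: "ingress-ready=true"
  extraPortMappings:
  # Forward host ports 80/443 into the node so curl localhost reaches the controller.
  - containerPort: 80
    hostPort: 80
    protocol: TCP
  - containerPort: 443
    hostPort: 443
    protocol: TCP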

There are many ingress controllers to choose from. Kubernetes does not start with or ship a default controller like it does for other pieces. The Kubernetes community supports the AWS, GCE, and Nginx ingress controllers. Table 5-1 outlines several Ingress options.

Table 5-1. Brief List of Ingress Controller Options

Name                                     Commercial Support   Engine                Protocol Support            SSL Termination
Ambassador Ingress Controller            Yes                  Envoy                 gRPC, HTTP/2, WebSockets    Yes
Community Ingress Nginx                  No                   NGINX                 gRPC, HTTP/2, WebSockets    Yes
NGINX Inc. Ingress                       Yes                  NGINX                 HTTP, WebSocket, gRPC       Yes
HAProxy Ingress                          Yes                  HAProxy               gRPC, HTTP/2, WebSockets    Yes
Istio Ingress                            No                   Envoy                 HTTP, HTTPS, gRPC, HTTP/2   Yes
Kong Ingress Controller for Kubernetes   Yes                  Lua on top of NGINX   gRPC, HTTP/2                Yes
Traefik Kubernetes Ingress               Yes                  Traefik               HTTP/2, gRPC, WebSockets    Yes

Some things to consider when deciding on the ingress controller for your clusters:

  • Protocol support - Do you need more than plain HTTP, for example gRPC integration or WebSockets?

  • Commercial support - Do you need commercial support?

  • Advanced features - Are JWT/OAuth2 authentication or circuit breakers requirements for your applications?

  • API gateway features - Do you need API gateway functionality such as rate limiting?

  • Traffic distribution - Does your application require support for specialized traffic distribution such as canary releases, A/B testing, or mirroring?

For our example we have chosen to use the Community version of the NGINX Ingress Controller.

Tip

For a list of more Ingress Controllers to choose from, the kubernetes.io site maintains a list here https://kubernetes.io/docs/concepts/services-networking/ingress-controllers/

Let's deploy the NGINX ingress controller into our kind cluster.

kubectl apply -f ingress.yaml
namespace/ingress-nginx created
serviceaccount/ingress-nginx created
configmap/ingress-nginx-controller created
clusterrole.rbac.authorization.k8s.io/ingress-nginx created
clusterrolebinding.rbac.authorization.k8s.io/ingress-nginx created
role.rbac.authorization.k8s.io/ingress-nginx created
rolebinding.rbac.authorization.k8s.io/ingress-nginx created
service/ingress-nginx-controller-admission created
service/ingress-nginx-controller created
deployment.apps/ingress-nginx-controller created
validatingwebhookconfiguration.admissionregistration.k8s.io/ingress-nginx-admission created
serviceaccount/ingress-nginx-admission created
clusterrole.rbac.authorization.k8s.io/ingress-nginx-admission created
clusterrolebinding.rbac.authorization.k8s.io/ingress-nginx-admission created
role.rbac.authorization.k8s.io/ingress-nginx-admission created
rolebinding.rbac.authorization.k8s.io/ingress-nginx-admission created
job.batch/ingress-nginx-admission-create created
job.batch/ingress-nginx-admission-patch created

As with all deployments, we must wait for the controller to be ready before we can use it. With the command below, we can verify that our ingress controller is ready for use.

kubectl wait --namespace ingress-nginx \
>   --for=condition=ready pod \
>   --selector=app.kubernetes.io/component=controller \
>   --timeout=90s
pod/ingress-nginx-controller-76b5f89575-zps4k condition met

The Controller is deployed to the cluster, and now we’re ready to write Ingress rules for our application.

Deploy Ingress rules

Our YAML manifest, ingress-rule.yaml, defines the ingress rules to use with our Golang web server example; a sketch of it follows.
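Based on the rule we are about to see, ingress-rule.yaml looks roughly like the sketch below; the repository's copy may use an older apiVersion, as the ingress.extensions output suggests.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-resource
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
  - http:
      paths:
      # Route /host to the Golang web server's ClusterIP Service.
      - path: /host
        pathType: Prefix
        backend:
          service:
            name: clusterip-service
            port:
              number: 8080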

kubectl apply -f ingress-rule.yaml
ingress.extensions/ingress-resource created

kubectl get ingress
NAME               CLASS    HOSTS   ADDRESS   PORTS   AGE
ingress-resource   <none>   *                 80      4s

With kubectl describe, we can see the backends that map to the ClusterIP Service and its pods.

kubectl describe ingress
Name:             ingress-resource
Namespace:        default
Address:
Default backend:  default-http-backend:80 (<error: endpoints "default-http-backend" not found>)
Rules:
  Host        Path  Backends
  ----        ----  --------
  *
              /host  clusterip-service:8080 (10.244.1.6:8080,10.244.1.7:8080,10.244.1.8:8080)
Annotations:  kubernetes.io/ingress.class: nginx
Events:
  Type    Reason  Age   From                      Message
  ----    ------  ----  ----                      -------
  Normal  Sync    17s   nginx-ingress-controller  Scheduled for sync

Our ingress rule matches only the /host route and routes requests to our clusterip-service on port 8080.

We can test it with curl against http://localhost/host.

curl localhost/host
NODE: kind-worker2, POD IP:10.244.1.6
curl localhost/healthz

Now that we can see how powerful Ingresses are, let us deploy a second Deployment and ClusterIP Service.

Our new deployment and service will be used to answer the requests for /data.
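The ingress rule in ingress-example-2.yaml follows the same shape; this sketch assumes the second Service also listens on port 8080, like our first one.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-resource-2
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
  - http:
      paths:
      # Route /data to the second Deployment's ClusterIP Service.
      - path: /data
        pathType: Prefix
        backend:
          service:
            name: clusterip-service-2
            port:
              number: 8080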

kubectl apply -f ingress-example-2.yaml
deployment.apps/app2 created
service/clusterip-service-2 configured
ingress.extensions/ingress-resource-2 configured

Now both /host and /data work, but they are routed to separate Services.

curl localhost/host
NODE: kind-worker2, POD IP:10.244.1.6

curl localhost/data
Database Connected

Since Ingresses work at Layer 7, there are many more options for routing traffic, such as host headers and URI paths.

For more advanced traffic routing and release patterns, a service mesh must be deployed in the cluster network. Let's dig into that next.

Service Meshes

A new cluster with default options has some limitations, so let's get an understanding of what those limitations are and how a service mesh can resolve some of them. A service mesh is an API-driven infrastructure layer for handling service-to-service communication.

From a security point of view, all traffic between pods inside the cluster is unencrypted, and each application team that runs a service must configure monitoring for that service separately. We have discussed the Service types, but we have not discussed how to update the Deployments of pods behind them. Service meshes support more deployment strategies than the basic rolling update and recreate, such as canary releases. From a developer's perspective, injecting faults into the network is useful, but it is not directly supported in default Kubernetes network deployments. With a service mesh, developers can add fault testing; instead of just killing pods, they can use the mesh to inject delays. Without a mesh, each application would have to build in its own fault testing or circuit breaking.

There are several pieces of functionality that a Service Mesh enhances or provides in a default Kubernetes cluster network.

Service Discovery

Instead of relying on DNS for service discovery, the service mesh manages service discovery and removes the need for it to be implemented in each individual application

Load Balancing

The service mesh adds more advanced load balancing algorithms, such as least request, consistent hashing, and zone-aware balancing

Communication Resiliency

The service mesh can increase communication resilience for applications by removing the need to implement retries, timeouts, circuit breaking, or rate limiting in application code

Security

A service mesh can provide:

  • End-to-end encryption with mTLS between services

  • Authorization policies, which control which services can communicate with each other, going beyond the Layer 3 and 4 controls of Kubernetes network policies

Observability

Service meshes add observability by enriching Layer 7 metrics and adding tracing and alerting

Routing Control

Traffic shifting and mirroring in the cluster

API

All of this can be controlled via an API provided by the service mesh implementation

Let’s walk through several components of a service mesh in Figure 5-8.

Service Mesh Components
Figure 5-8. Service Mesh Components

Traffic is handled differently depending on the component or the destination of the traffic. Traffic into and out of the cluster is managed by the gateways. Traffic between the frontend, backend, and user services is handled by the service mesh and encrypted with mutual TLS. All traffic to the frontend, backend, and user pods in the service mesh is proxied by the sidecar proxy deployed within each pod. Even if the control plane is down and updates cannot be made to the mesh, service and application traffic is not affected.

There are several options to choose from when deploying a service mesh; here are highlights of just a few:

  • Istio

    • Uses a Go control plane with an Envoy Proxy

    • This is a Kubernetes-native solution that was initially released by Google, IBM, and Lyft

  • Consul

    • Uses Hashicorp Consul as the control plane

    • Consul Connect uses an agent installed on every node as a DaemonSet, which communicates with the Envoy sidecar proxies that handle routing and forwarding of traffic

  • AWS App Mesh

    • Is an AWS Managed solution that implements its own control plane

    • Does not have mTLS or traffic policy

    • Uses the Envoy proxy for the Data plane

  • Linkerd

    • Also uses Go for the control plane with the Linkerd proxy

    • No traffic shifting and no distributed tracing

    • Is a Kubernetes-only solution, which results in fewer moving pieces, and means that Linkerd has less complexity overall

It is our opinion that the best use case for a service mesh is mutual TLS between services. Other higher-level use cases for developers include circuit breaking and fault testing for APIs. For network administrators, advanced routing policies and algorithms can be deployed with service meshes.

Let's look at a service mesh example. The first thing you need to do, if you haven't already, is install the Linkerd CLI. The directions are at https://linkerd.io/2/getting-started/

Your choices are piping curl to sh, or brew if you're on a Mac.

curl -sL https://run.linkerd.io/install | sh

OR

brew install linkerd

linkerd version
Client version: stable-2.9.2
Server version: unavailable

The pre-flight check will verify that our cluster can run Linkerd.

 linkerd check --pre
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

kubernetes-version
------------------
√ is running the minimum Kubernetes API version
√ is running the minimum kubectl version

pre-kubernetes-setup
--------------------
√ control plane namespace does not already exist
√ can create non-namespaced resources
√ can create ServiceAccounts
√ can create Services
√ can create Deployments
√ can create CronJobs
√ can create ConfigMaps
√ can create Secrets
√ can read Secrets
√ can read extension-apiserver-authentication configmap
√ no clock skew detected

pre-kubernetes-capability
-------------------------
√ has NET_ADMIN capability
√ has NET_RAW capability

linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date

Status check results are √

The linkerd CLI tool can install Linkerd onto our kind cluster for us.

linkerd install | kubectl apply -f -
namespace/linkerd created
clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-identity created
clusterrolebinding.rbac.authorization.k8s.io/linkerd-linkerd-identity created
serviceaccount/linkerd-identity created
clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-controller created
clusterrolebinding.rbac.authorization.k8s.io/linkerd-linkerd-controller created
serviceaccount/linkerd-controller created
clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-destination created
clusterrolebinding.rbac.authorization.k8s.io/linkerd-linkerd-destination created
serviceaccount/linkerd-destination created
role.rbac.authorization.k8s.io/linkerd-heartbeat created
rolebinding.rbac.authorization.k8s.io/linkerd-heartbeat created
serviceaccount/linkerd-heartbeat created
role.rbac.authorization.k8s.io/linkerd-web created
rolebinding.rbac.authorization.k8s.io/linkerd-web created
clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-web-check created
clusterrolebinding.rbac.authorization.k8s.io/linkerd-linkerd-web-check created
clusterrolebinding.rbac.authorization.k8s.io/linkerd-linkerd-web-admin created
serviceaccount/linkerd-web created
customresourcedefinition.apiextensions.k8s.io/serviceprofiles.linkerd.io created
customresourcedefinition.apiextensions.k8s.io/trafficsplits.split.smi-spec.io created
clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-proxy-injector created
clusterrolebinding.rbac.authorization.k8s.io/linkerd-linkerd-proxy-injector created
serviceaccount/linkerd-proxy-injector created
secret/linkerd-proxy-injector-k8s-tls created
mutatingwebhookconfiguration.admissionregistration.k8s.io
 /linkerd-proxy-injector-webhook-config created
clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-sp-validator created
clusterrolebinding.rbac.authorization.k8s.io/linkerd-linkerd-sp-validator created
serviceaccount/linkerd-sp-validator created
secret/linkerd-sp-validator-k8s-tls created
validatingwebhookconfiguration.admissionregistration.k8s.io
 /linkerd-sp-validator-webhook-config created
clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-tap created
clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-tap-admin created
clusterrolebinding.rbac.authorization.k8s.io/linkerd-linkerd-tap created
clusterrolebinding.rbac.authorization.k8s.io/linkerd-linkerd-tap-auth-delegator created
serviceaccount/linkerd-tap created
rolebinding.rbac.authorization.k8s.io/linkerd-linkerd-tap-auth-reader created
secret/linkerd-tap-k8s-tls created
apiservice.apiregistration.k8s.io/v1alpha1.tap.linkerd.io created
podsecuritypolicy.policy/linkerd-linkerd-control-plane created
role.rbac.authorization.k8s.io/linkerd-psp created
rolebinding.rbac.authorization.k8s.io/linkerd-psp created
configmap/linkerd-config created
secret/linkerd-identity-issuer created
service/linkerd-identity created
service/linkerd-identity-headless created
deployment.apps/linkerd-identity created
service/linkerd-controller-api created
deployment.apps/linkerd-controller created
service/linkerd-dst created
service/linkerd-dst-headless created
deployment.apps/linkerd-destination created
cronjob.batch/linkerd-heartbeat created
service/linkerd-web created
deployment.apps/linkerd-web created
deployment.apps/linkerd-proxy-injector created
service/linkerd-proxy-injector created
service/linkerd-sp-validator created
deployment.apps/linkerd-sp-validator created
service/linkerd-tap created
deployment.apps/linkerd-tap created
serviceaccount/linkerd-grafana created
configmap/linkerd-grafana-config created
service/linkerd-grafana created
deployment.apps/linkerd-grafana created
clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-prometheus created
clusterrolebinding.rbac.authorization.k8s.io/linkerd-linkerd-prometheus created
serviceaccount/linkerd-prometheus created
configmap/linkerd-prometheus-config created
service/linkerd-prometheus created
deployment.apps/linkerd-prometheus created
secret/linkerd-config-overrides created

As with the Ingress Controller and MetalLB we can see that a lot of components are installed in our cluster.

Linkerd can validate the installation with the linkerd check CLI command.

It runs a plethora of checks against the Linkerd install, including but not limited to the Kubernetes API version, the controllers, pods, and configs required to run Linkerd, and the services, versions, and APIs Linkerd needs.

linkerd check
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

kubernetes-version
------------------
√ is running the minimum Kubernetes API version
√ is running the minimum kubectl version

linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ controller pod is running
√ can initialize the client
√ can query the control plane API

linkerd-config
--------------
√ control plane Namespace exists
√ control plane ClusterRoles exist
√ control plane ClusterRoleBindings exist
√ control plane ServiceAccounts exist
√ control plane CustomResourceDefinitions exist
√ control plane MutatingWebhookConfigurations exist
√ control plane ValidatingWebhookConfigurations exist
√ control plane PodSecurityPolicies exist

linkerd-identity
----------------
√ certificate config is valid
√ trust anchors are using supported crypto algorithm
√ trust anchors are within their validity period
√ trust anchors are valid for at least 60 days
√ issuer cert is using supported crypto algorithm
√ issuer cert is within its validity period
√ issuer cert is valid for at least 60 days
√ issuer cert is issued by the trust anchor

linkerd-webhooks-and-apisvc-tls
-------------------------------
√ tap API server has valid cert
√ tap API server cert is valid for at least 60 days
√ proxy-injector webhook has valid cert
√ proxy-injector cert is valid for at least 60 days
√ sp-validator webhook has valid cert
√ sp-validator cert is valid for at least 60 days

linkerd-api
-----------
√ control plane pods are ready
√ control plane self-check
√ [kubernetes] control plane can talk to Kubernetes
√ [prometheus] control plane can talk to Prometheus
√ tap api service is running

linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date

control-plane-version
---------------------
√ control plane is up-to-date
√ control plane and cli versions match

linkerd-prometheus
------------------
√ prometheus add-on service account exists
√ prometheus add-on config map exists
√ prometheus pod is running

linkerd-grafana
---------------
√ grafana add-on service account exists
√ grafana add-on config map exists
√ grafana pod is running

Status check results are √

Now that everything looks good with our install of Linkerd we can add our application to the Service Mesh.

kubectl -n linkerd get deploy
NAME                     READY   UP-TO-DATE   AVAILABLE   AGE
linkerd-controller       1/1     1            1           3m17s
linkerd-destination      1/1     1            1           3m17s
linkerd-grafana          1/1     1            1           3m16s
linkerd-identity         1/1     1            1           3m17s
linkerd-prometheus       1/1     1            1           3m16s
linkerd-proxy-injector   1/1     1            1           3m17s
linkerd-sp-validator     1/1     1            1           3m17s
linkerd-tap              1/1     1            1           3m17s
linkerd-web              1/1     1            1           3m17s

Let us pull up the Linkerd console to investigate what we have just deployed. We can install the dashboard components and start the console with the commands shown below.

This will proxy the console to our local machine, available at http://localhost:50750.

linkerd viz install | kubectl apply -f -
linkerd viz dashboard
Linkerd dashboard available at:
http://localhost:50750
Grafana dashboard available at:
http://localhost:50750/grafana
Opening Linkerd dashboard in the default browser
Tip

If you’re having issues with reaching the dashboard, you can run linkerd viz check and find more help here https://linkerd.io/2.10/tasks/troubleshooting/index.html

We can see all our deployed objects from the previous exercises in Figure 5-9.

Linkderd Dashboard
Figure 5-9. Linkerd Dashboard

Our clusterip-service is not yet part of the Linkerd service mesh. We will need to use the proxy injector to add our service to the mesh. It accomplishes this by watching for a specific annotation, which can be added either with linkerd inject or by hand to the pod's spec.
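Added by hand, the annotation the injector watches for is a single line in the Deployment's pod template; a minimal fragment looks like this.

spec:
  template:
    metadata:
      annotations:
        # Tells the Linkerd proxy injector to add the sidecar to these pods.
        linkerd.io/inject: enabled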

Let us remove some of the earlier exercises' resources for clarity.

kubectl delete -f ingress-example-2.yaml
deployment.apps "app2" deleted
service "clusterip-service-2" deleted
ingress.extensions "ingress-resource-2" deleted

kubectl delete pods app-5586fc9d77-7frts
pod "app-5586fc9d77-7frts" deleted

kubectl delete -f ingress-rule.yaml
ingress.extensions "ingress-resource" deleted

We can use the linkerd CLI to inject the proper annotations into our deployment spec so that it becomes part of the mesh.

We first get our application manifest with cat web.yaml, use Linkerd to inject the annotations with linkerd inject -, and then apply it back to the Kubernetes API with kubectl apply -f -.

cat web.yaml | linkerd inject - | kubectl apply -f -

deployment "app" injected
deployment.apps/app configured

If we describe our app deployment, we can see that Linkerd has injected the new annotation for us: linkerd.io/inject: enabled.

kubectl describe deployment app
Name:                   app
Namespace:              default
CreationTimestamp:      Sat, 30 Jan 2021 13:48:47 -0500
Labels:                 <none>
Annotations:            deployment.kubernetes.io/revision: 3
Selector:               app=app
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:       app=app
  Annotations:  linkerd.io/inject: enabled
  Containers:
   go-web:
    Image:      strongjz/go-web:v0.0.6
    Port:       8080/TCP
    Host Port:  0/TCP
    Liveness:   http-get http://:8080/healthz delay=5s timeout=1s period=5s
    Readiness:  http-get http://:8080/ delay=5s timeout=1s period=5s
    Environment:
      MY_NODE_NAME:             (v1:spec.nodeName)
      MY_POD_NAME:              (v1:metadata.name)
      MY_POD_NAMESPACE:         (v1:metadata.namespace)
      MY_POD_IP:                (v1:status.podIP)
      MY_POD_SERVICE_ACCOUNT:   (v1:spec.serviceAccountName)
      DB_HOST:                 postgres
      DB_USER:                 postgres
      DB_PASSWORD:             mysecretpassword
      DB_PORT:                 5432
    Mounts:                    <none>
  Volumes:                     <none>
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    True    NewReplicaSetAvailable
OldReplicaSets:  <none>
NewReplicaSet:   app-78dfbb4854 (1/1 replicas created)
Events:
  Type   Reason            Age   From                   Message
  ----   ------            ----  ----                   -------
  Normal ScalingReplicaSet 4m4s  deployment-controller  Scaled down app-5586fc9d77
  Normal ScalingReplicaSet 4m4s  deployment-controller  Scaled up app-78dfbb4854
  Normal Injected          4m4s  linkerd-proxy-injector Linkerd sidecar injected
  Normal ScalingReplicaSet 3m54s deployment-controller  Scaled app-5586fc9d77

If we navigate to the app in the Dashboard we can see that our Deployment is part of the Linkerd Service Mesh now as shown in Figure 5-10.

http://localhost:50750/namespaces/default/deployments/app

App Linkderd Dashboard
Figure 5-10. Web App Deployment Linkerd Dashboard

The CLI can also display our stats for us.

linkerd stat deployments -n default
NAME   MESHED   SUCCESS      RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99   TCP_CONN
app       1/1   100.00%   0.4rps           1ms           1ms           1ms          1

Again let us scale up our deployment!

kubectl scale deploy app --replicas 10
deployment.apps/app scaled

In Figure 5-11, we open this link in the web browser so we can watch the stats in real time. Select the default namespace and, under Resources, our deployment/app, then click Start for the web UI to begin displaying the metrics.

http://localhost:50750/top?namespace=default&resource=deployment%2Fapp

App Stats Dashboard
Figure 5-11. Web App Dashboard

In a separate terminal let us use the netshoot image, but this time running inside our kind cluster.

kubectl run tmp-shell --rm -i --tty --image nicolaka/netshoot -- /bin/bash
If you don't see a command prompt, try pressing enter.
bash-5.0#

Let us send one hundred queries and watch the stats.

bash-5.0# for i in `seq 1 100`; do curl http://clusterip-service/host && sleep 2; done

In our terminal we can see all the liveness and readiness probes, as well as our /host requests.

tmp-shell is our netshoot bash terminal with our for loop running.

10.244.2.1, 10.244.3.1, and 10.244.2.1 are the kubelets on the hosts running our probes for us.

linkerd viz stat deploy
NAME   MESHED   SUCCESS      RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99   TCP_CONN
app       1/1   100.00%   0.7rps           1ms           1ms           1ms          3

Our example showed only the observability functionality of a service mesh. Linkerd, Istio, and the like have many more options available for developers and network administrators to control, monitor, and troubleshoot services running inside their cluster network. As with ingress controllers, there are many options and features available; it is up to you and your teams to decide which functionality and features are important for your networks.

Conclusion

The Kubernetes networking world is feature rich, with many options for teams to deploy, test, and manage within their Kubernetes clusters. Each new addition adds complexity and overhead to cluster operations. We have given developers and network and system administrators a view into the abstractions that Kubernetes offers.

From internal to the cluster to external, teams must choose which abstractions work best for their workloads. This is no small task, and now you are armed with the knowledge to begin those discussions.

In our next chapter, we take our Kubernetes Services and networking learnings to the cloud! We will explore the network services offered by each cloud provider and how they are integrated into their managed Kubernetes service offerings.
