8

Preparing the Cluster for OS Interoperability

In this chapter, we will learn about the configurations that need to be in place to successfully operate a heterogeneous Amazon EKS cluster. First, we will understand the options available to set up the VPC CNI plugin for Windows. Next, we will learn how to use taints, tolerations, and node selectors to avoid pod-scheduling disruption between Windows and Linux. At the end, we will learn about the auto-scaling options available to dynamically scale out Windows pods.

The chapter will cover the following topics:

  • Setting up the VPC CNI plugin for Windows support
  • Avoiding pod-scheduling disruption
  • Dynamically scaling out Windows pods

Setting up the VPC CNI plugin for Windows support

In order to enable Windows-based node support on Amazon EKS, two Kubernetes controllers are required to successfully route Windows pod network traffic through Amazon VPC using Amazon VPC CNI:

  • The VPC admission controller is responsible for mutating Windows pod specifications so that they request the VPC resources (private IPv4 addresses) they need from Kubernetes
  • The VPC resource controller is responsible for Windows IP address management (IPAM), allocating VPC IP addresses to Windows pods so that network rules can be created and maintained from the Windows pods all the way up to the VPC

In the past, customers enabled Windows support by deploying the VPC admission controller and VPC resource controller on the data plane, in the kube-system namespace on top of a Linux node group. The problem with that approach was that there was little AWS documentation on how to properly set up high availability for, or troubleshoot, such critical controllers. AWS made life easier in 2022 by hosting both controllers on the control plane (AWS-managed).

The setup is now very simple, assuming you already have an Amazon EKS cluster up and running and that the AmazonEKSVPCResourceController managed policy is attached to your cluster role. You just need to add the following config map:

resource "kubernetes_config_map" "amazon_vpc_cni_windows" {
  depends_on = [
    aws_eks_cluster.eks_windows
  ]
  metadata {
    name      = "amazon-vpc-cni"
    namespace = "kube-system"
  }
  data = {
    "enable-windows-ipam" = "true"
  }
}

By setting enable-windows-ipam to "true", Windows IPAM is enabled on the AWS-managed controllers running on the control plane. Because they are now part of AWS's responsibility in the shared responsibility model, you won't have access to list these controllers or read their logs.
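If you don't manage your cluster with Terraform, the same object can be applied directly with kubectl. A minimal sketch of the equivalent manifest (the Terraform resource above produces the same ConfigMap):

```yaml
# Apply with: kubectl apply -f vpc-cni-windows.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: amazon-vpc-cni
  namespace: kube-system
data:
  enable-windows-ipam: "true"
```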

In this section, we learned what the VPC admission controller and VPC resource controller are and how to enable Windows support on Amazon EKS.

Avoiding pod-scheduling disruption

Running a heterogeneous Kubernetes cluster imposes some particularities that need to be considered to reduce application deployment disruption. By default, Amazon EKS uses kube-scheduler as the default scheduler for Kubernetes; when you schedule a pod, kube-scheduler uses a combination of filtering and scoring to select the optimal node to run the pod, which works nicely when you only run Linux-based applications.

kube-scheduler doesn't take the OS into account by default, and as a result, it is common to see Linux-based sidecar containers or plugins running as DaemonSets fail to be scheduled simply because their manifests don't set a nodeSelector. A DaemonSet ensures that all nodes run a copy of a pod, but in a heterogeneous cluster, Windows pods cannot run on Linux-based nodes and vice versa, so those pods fail to schedule and your deployment will never reach a 100% success rate.

We can work around this issue by implementing a combination of nodeSelector and taints and tolerations, which will give us more control over how kube-scheduler deploys pods across a heterogeneous cluster:

Figure 8.1 – Pods failing to be scheduled due to mismatched OS


Using nodeSelector to avoid pod-scheduling disruption

In Kubernetes, nodes are automatically deployed with two OS-related labels:

  • kubernetes.io/os: This label is automatically set to windows or linux based on the node's OS
  • node.kubernetes.io/windows-build: On Windows nodes, this label is automatically set to the Windows Server build version, as reported by systeminfo

You can leverage the kubernetes.io/os = windows label to instruct kube-scheduler to take the OS into consideration by specifying nodeSelector properties:

...
  spec:
    containers:
    - name: windows-server-iis-ltsc2019
      image: mcr.microsoft.com/windows/servercore:ltsc2019
      ports:
      - name: http
        containerPort: 80
      imagePullPolicy: IfNotPresent
    nodeSelector:
      kubernetes.io/os: windows
...

This is the simplest way to work around this situation, and it works pretty well depending on your Kubernetes cluster's complexity. However, it comes with the cost of having to add nodeSelector to all your deployment specifications. In addition, it may be a challenge if you already have many Linux deployments or use community Helm charts, which are rarely built with heterogeneous clusters in mind.
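The same approach fixes the DaemonSet problem described earlier: pinning a Linux-only agent to Linux nodes takes a single nodeSelector line. A hedged sketch, where the agent name and image are placeholders:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-agent              # hypothetical Linux-only logging agent
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: log-agent
  template:
    metadata:
      labels:
        app: log-agent
    spec:
      nodeSelector:
        kubernetes.io/os: linux   # skip Windows nodes entirely
      containers:
      - name: log-agent
        image: example.com/log-agent:latest   # placeholder image
```

With this selector in place, the DaemonSet schedules a copy of the pod only on Linux nodes and reports a 100% success rate even when Windows nodes join the cluster.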

Using taints and tolerations to avoid pod-scheduling disruption

Tainting your Windows nodes is the most efficient way to ensure that existing and future Linux deployments won't be scheduled on Windows nodes. A taint repels a set of pods from being scheduled on a node or a node group, which results in a sort of dome over your Windows nodes:

Figure 8.2 – Windows nodes repelling Linux pods from being scheduled


This is very useful when you have different DaemonSets that automatically schedule monitoring and logging solutions that aren't compatible with Windows and don't use nodeSelector in their specifications. By tainting Windows nodes, you will avoid Linux-based application deployment disruption and reduce the number of deployment errors on your Amazon EKS cluster.

In order to apply a taint to your Windows node group, you just need to add the following code block to the aws_eks_node_group Terraform resource:

taint {
  key    = "os"
  value  = "windows"
  effect = "NO_SCHEDULE"
}
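If you provision node groups with eksctl rather than Terraform, the equivalent taint can be declared in the ClusterConfig file. A sketch assuming a hypothetical managed node group named windows-ng:

```yaml
# Fragment of an eksctl ClusterConfig
managedNodeGroups:
  - name: windows-ng                           # hypothetical node group name
    amiFamily: WindowsServer2019FullContainer
    taints:
      - key: os
        value: windows
        effect: NoSchedule
```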

However, once the taint is applied to the Windows node group, pods, including Windows pods, will only be scheduled onto these nodes if they tolerate the taint. As a next step, we need to add a matching toleration to the pod manifest (note that taint keys and values are case-sensitive):

      nodeSelector:
        kubernetes.io/os: windows
      tolerations:
      - key: "os"
        operator: "Equal"
        value: "windows"
        effect: "NoSchedule"

Combining the Windows node group taint at the host level with the nodeSelector and toleration in your pod manifests ensures that pods are not scheduled on the wrong OS platform, thereby reducing scheduling disruption across your heterogeneous cluster.

Dynamically scaling out Windows pods

This is a topic that I love, especially because it gets people to think about scaling out Windows pods. Container workloads and cloud-native applications are designed with the premise that they scale as needed, which means the entire platform, including the container orchestrator, must be prepared to support that demand.

My question for you is simple. Does the Windows application you plan to run on top of a container need to scale out? I bet if it is a 5- or 10-year-old application running on-premises, it doesn’t currently have any scale-out mechanism, and the application wasn’t designed to support that. So why should you care about it if your application doesn’t leverage this benefit?

Sometimes, we overthink technical requirements and assume premises that aren't necessary, and all this is because people keep comparing Windows and Linux containers. Now, assuming that you do want to dynamically scale out Windows pods, Cluster Autoscaler and Horizontal Pod Autoscaler are currently your choices.

Cluster Autoscaler is a tool that automatically adjusts the number of worker nodes in a node group when either of two conditions is met:

  • Pods are failing to be scheduled due to insufficient resources
  • Worker nodes in the cluster have been underutilized for an extended period

Horizontal Pod Autoscaler automatically scales the number of pods based on an average CPU utilization target that you define.

By running both, you can successfully scale out your Windows pods: Horizontal Pod Autoscaler scales the number of Windows pods to meet demand based on pod CPU utilization, and Cluster Autoscaler provisions new Windows worker nodes to accommodate the pods that Horizontal Pod Autoscaler creates.
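To make this concrete, here is a minimal sketch of a Horizontal Pod Autoscaler targeting a hypothetical Windows Deployment named windows-server-iis, scaling on average CPU utilization (the Deployment name, replica bounds, and 75% target are illustrative assumptions):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: windows-server-iis
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: windows-server-iis   # hypothetical Deployment name
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75   # scale out when average CPU exceeds 75%
```

When the new replicas no longer fit on the existing Windows nodes, Cluster Autoscaler detects the pending pods and provisions additional nodes in the Windows node group.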

Summary

In this chapter, we covered the fundamentals of the VPC CNI controllers for Windows and how to enable them. Then, we covered an important topic: avoiding pod-scheduling disruption on an Amazon EKS heterogeneous cluster. Finally, we discussed the autoscaler controllers available for Windows pods.

In Chapter 9, Deploying a Windows Node Group, we will dive deep into deploying an Amazon EKS heterogeneous cluster and the particularities of a Windows node group.
