In this chapter, we will learn about the configurations that need to be in place to successfully operate a heterogeneous Amazon EKS cluster. First, we will understand the options available to set up the VPC CNI plugin for Windows. Next, we will learn how to use taints, tolerations, and node selectors to avoid pod-scheduling disruption between Windows and Linux. At the end, we will learn about the auto-scaling options available to dynamically scale out Windows pods.
The chapter will cover the following topics:
In order to enable Windows-based node support on Amazon EKS, two Kubernetes controllers are required to successfully route Windows pod network traffic through Amazon VPC using Amazon VPC CNI:
In the past, customers enabled Windows support by deploying the VPC admission controller and the VPC resource controller on the data plane, on a Linux node group in the kube-system namespace. The problem with that approach was that little AWS documentation existed on how to set up these critical controllers for high availability or how to troubleshoot them. In 2022, AWS made life easier by hosting both controllers on the AWS-managed control plane.
The setup is now very simple, assuming you already have an Amazon EKS cluster up and running and that the AmazonEKSVPCResourceController managed policy is attached to your cluster role. You just need to add the following config map:
resource "kubernetes_config_map" "amazon_vpc_cni_windows" {
  depends_on = [
    aws_eks_cluster.eks_windows
  ]
  metadata {
    name      = "amazon-vpc-cni"
    namespace = "kube-system"
  }
  data = {
    enable-windows-ipam : "true"
  }
}
By setting enable-windows-ipam : "true", both controllers are scheduled to run on the control plane. You won't be able to list them or read their logs, because they are now AWS's responsibility under the shared responsibility model.
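If you manage cluster configuration with plain manifests rather than Terraform, the same setting can be expressed as a Kubernetes ConfigMap. The following is a minimal sketch that mirrors the name, namespace, and key used in the Terraform resource; applying it with kubectl is an assumption about your workflow, not a requirement:

```yaml
# Equivalent of the Terraform kubernetes_config_map resource shown earlier.
apiVersion: v1
kind: ConfigMap
metadata:
  name: amazon-vpc-cni
  namespace: kube-system
data:
  # ConfigMap values are strings, so the quotes around "true" are required.
  enable-windows-ipam: "true"
```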
In this section, we learned what the VPC admission controller and VPC resource controller are and how to enable Windows support on Amazon EKS.
Running a heterogeneous Kubernetes cluster imposes some particularities that need to be considered to reduce application deployment disruption. By default, Amazon EKS uses kube-scheduler as the default scheduler for Kubernetes; when you schedule a pod, kube-scheduler uses a combination of filtering and scoring to select the optimal node to run the pod, which works nicely when you only run Linux-based applications.
By default, kube-scheduler doesn't take the OS into account unless the pod specification tells it to. As a result, it is common to see Linux-based sidecar containers or plugins running as DaemonSets fail to deploy simply because their manifests don't include a nodeSelector. For instance, a DaemonSet ensures that all nodes run a copy of a pod, but in a heterogeneous cluster, Windows pods cannot run on Linux-based nodes and vice versa, so those pods fail and your deployment never reaches a 100% success rate.
We can work around this issue by implementing a combination of nodeSelector and taints and tolerations, which will give us more control over how kube-scheduler deploys pods across a heterogeneous cluster:
Figure 8.1 – Pods failing to be scheduled due to mismatched OS
In Kubernetes, all nodes today are deployed with two specific Windows labels:
You can leverage the kubernetes.io/os = windows label to instruct kube-scheduler to take the OS into consideration by specifying nodeSelector properties:
...
spec:
  containers:
  - name: windows-server-iis-ltsc2019
    image: mcr.microsoft.com/windows/servercore:ltsc2019
    ports:
    - name: http
      containerPort: 80
    imagePullPolicy: IfNotPresent
  nodeSelector:
    kubernetes.io/os: windows
...
This is the simplest way to work around this situation, and it works well depending on your Kubernetes cluster's complexity. However, it comes at the cost of adding nodeSelector to all your deployment specifications. In addition, it may be a challenge if you already have many Linux deployments or rely on community Helm charts, which are rarely built with heterogeneous clusters in mind.
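The same technique applies in the other direction: a Linux-only DaemonSet can pin itself to Linux nodes through the kubernetes.io/os label. The manifest below is a minimal sketch; the log-agent name and the example.com image are hypothetical placeholders, not components referenced elsewhere in this chapter:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-agent                # hypothetical name, for illustration only
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: log-agent
  template:
    metadata:
      labels:
        app: log-agent
    spec:
      # Without this selector, the DaemonSet would also target Windows nodes.
      nodeSelector:
        kubernetes.io/os: linux
      containers:
      - name: log-agent
        image: example.com/log-agent:1.0   # placeholder image
```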
Tainting your Windows nodes is the most efficient way to ensure that neither existing nor future Linux deployments are scheduled on Windows nodes. A taint repels a set of pods from being scheduled on a node or a node group, which results in a sort of dome over your Windows nodes:
Figure 8.2 – Windows nodes repelling Linux pods
This is very useful when you have different DaemonSets that automatically schedule monitoring and logging solutions that aren't compatible with Windows and don't use nodeSelector in their specifications. By tainting Windows nodes, you will avoid Linux-based application deployment disruption and reduce the number of deployment errors on your Amazon EKS cluster.
In order to apply a taint to your Windows node group, you just need to add the following code block to the aws_eks_node_group Terraform resource:
taint {
  key    = "os"
  value  = "windows"
  effect = "NO_SCHEDULE"
}
However, once the taint is applied to the Windows node group, pods will only be scheduled on these nodes if they tolerate the taint, and that includes Windows pods. As a next step, we need to add the matching toleration to the pod manifest:
nodeSelector:
  kubernetes.io/os: windows
tolerations:
- key: "os"
  operator: "Equal"
  value: "windows"
  effect: "NoSchedule"
Combining the taint on the Windows node group with the nodeSelector and toleration in the pod manifest ensures that pods are not scheduled on the wrong OS platform, thereby reducing scheduling disruption.
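To show how the pieces fit together, here is a sketch of a complete Deployment that carries both the nodeSelector and the toleration. It assumes the Windows node group has a taint with the key os, the value windows, and the NoSchedule effect; with the Equal operator, the toleration value must match the taint value exactly, including case. The Deployment name, labels, and replica count are hypothetical:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: windows-server-iis       # hypothetical name
spec:
  replicas: 2                    # arbitrary value for illustration
  selector:
    matchLabels:
      app: windows-server-iis
  template:
    metadata:
      labels:
        app: windows-server-iis
    spec:
      # Attracts the pod to Windows nodes only.
      nodeSelector:
        kubernetes.io/os: windows
      # Allows the pod onto nodes tainted with os=windows:NoSchedule.
      tolerations:
      - key: "os"
        operator: "Equal"
        value: "windows"
        effect: "NoSchedule"
      containers:
      - name: windows-server-iis-ltsc2019
        image: mcr.microsoft.com/windows/servercore:ltsc2019
        imagePullPolicy: IfNotPresent
        ports:
        - name: http
          containerPort: 80
```

The two fields do different jobs: the toleration only permits scheduling on tainted nodes, while the nodeSelector is what actually keeps the pod off Linux nodes, so both are needed.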
This is a topic that I love, especially to get people to think about scaling out Windows pods. Container workloads and cloud-native applications are designed on the premise that they scale as needed, which means the entire platform, including the container orchestrator, must be prepared to support that demand.
My question for you is simple. Does the Windows application you plan to run on top of a container need to scale out? I bet if it is a 5- or 10-year-old application running on-premises, it doesn’t currently have any scale-out mechanism, and the application wasn’t designed to support that. So why should you care about it if your application doesn’t leverage this benefit?
Sometimes, we overthink technical requirements and assume premises that aren’t necessary, and all this is because people keep comparing Windows and Linux containers. Now, assuming that you will dynamically scale out with Windows pods, at this time, Cluster Autoscaler and Horizontal Pod Autoscaler are your choices.
Cluster Autoscaler is a tool that automatically adjusts the number of worker nodes in a node group to match two conditions:
Horizontal Pod Autoscaler automatically scales the number of pods based on a target average CPU utilization that you define.
By running both, you can successfully scale out your Windows pods. Horizontal Pod Autoscaler scales the number of Windows pods to meet demand based on pod CPU utilization, and Cluster Autoscaler provisions new Windows worker nodes to accommodate the number of pods that Horizontal Pod Autoscaler requests.
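As an illustration, a Horizontal Pod Autoscaler for a Windows Deployment could look like the following sketch. The target Deployment name, the replica bounds, and the 60% CPU target are all hypothetical values chosen for the example. It also assumes that a metrics source such as Kubernetes Metrics Server is installed in the cluster and that the target containers declare CPU requests, because utilization is calculated as a percentage of the requested CPU:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: windows-server-iis       # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: windows-server-iis     # hypothetical Deployment to scale
  minReplicas: 2                 # illustrative lower bound
  maxReplicas: 10                # illustrative upper bound
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60   # target average CPU, as a % of requests
```

When the additional replicas no longer fit on the existing Windows nodes, they remain pending, which is the signal Cluster Autoscaler acts on to add nodes to the Windows node group.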
In this chapter, we covered the fundamentals of the VPC CNI controllers for Windows and how to enable them. Then, we covered how to avoid pod-scheduling disruption on a heterogeneous Amazon EKS cluster. Finally, we discussed the autoscaling controllers available for Windows pods.
In Chapter 9, Deploying a Windows Node Group, we will dive deep into deploying an Amazon EKS heterogeneous cluster and the particularities of a Windows node group.