Autoscaling

As you can see, Deployments are a great improvement over ReplicationControllers, allowing us to seamlessly update our applications, while integrating with the other resources of Kubernetes in much the same way.

Another area that we saw in the previous chapter, and also supported for Deployments, is Horizontal Pod Autoscalers (HPAs). HPAs help you manage cluster utilization by scaling the number of pods based on CPU utilization. There are three objects that can scale using HPAs, DaemonSets not included:

Deployment (the recommended method)
ReplicaSet
ReplicationController (not recommended)

The HPA is implemented as a control loop similar to other controllers that we've discussed, and you can adjust the sensitivity of the controller manager by adjusting its sync period via --horizontal-pod-autoscaler-sync-period (default 30 seconds).

We will walk through a quick remake of the HPAs from the previous chapter, this time using the Deployments we have created so far. Save the following code in node-js-deploy-hpa.yaml file:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: node-js-deploy
spec:
  minReplicas: 3
  maxReplicas: 6
  scaleTargetRef:
    apiVersion: v1
    kind: Deployment
    name: node-js-deploy
  targetCPUUtilizationPercentage: 10

The API is changing quickly with these tools as they're in beta, so take careful note of the apiVersion element, which used to be autoscaling/v1, but is now autoscalingv2beta1.

We have lowered the CPU threshold to 10% and changed our minimum and maximum pods to 3 and 6, respectively. Create the preceding HPA with our trusty kubectl create -f command. After this is completed, we can check that it's available with the kubectl get hpa command:

Horizontal pod autoscaler

We can also check that we have only 3 pods running with the kubectl get deploy command. Now, let's add some load to trigger the autoscaler:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: boomload-deploy
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: loadgenerator-deploy
    spec:
      containers:
      - image: williamyeh/boom
        name: boom-deploy
        command: ["/bin/sh","-c"]
        args: ["while true ; do boom http://node-js-deploy/ -c 100 -n 500 ; sleep 1 ; done"]

Create boomload-deploy.yaml file as usual. Now, monitor the HPA with the alternating kubectl get hpa and kubectl get deploy commands. After a few moments, we should see the load jump above 10%. After a few more moments, we should also see the number of pods increase all the way up to 6 replicas:

HPA increase and pod scale up

Again, we can clean this up by removing our load generation pod and waiting a few moments:

$ kubectl delete deploy boomload-deploy

Again, if we watch the HPA, we'll start to see the CPU usage drop. After a few minutes, we will go back down to 0% CPU load and then the Deployment will scale back to 3 replicas.

Table of Contents for Autoscaling

Create new playlist

Sign In

Sign Up

Table of Contents for
Autoscaling