Horizontal Pod Autoscaler (HPA)

An HPA periodically queries a metrics source, and its controller decides whether scaling is required based on the metrics it retrieves. There are two kinds of metrics it can consume: one comes from Heapster (https://github.com/kubernetes/heapster), the other from a RESTful client accessing the metrics API. In the following example, we'll show you how to use Heapster to monitor Pods and expose their metrics to an HPA.

First, Heapster has to be deployed in the cluster:

If you're running minikube, use the minikube addons enable heapster command to enable Heapster in your cluster. Note that the minikube logs | grep heapster command can also be used to check Heapster's logs.
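On minikube, the corresponding commands look like this (output may vary slightly between minikube versions):

// enable the heapster addon, then inspect its logs
# minikube addons enable heapster
# minikube logs | grep heapster
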
// At the time of writing this book, the latest heapster configuration file in kops is 1.7.0. Check out https://github.com/kubernetes/kops/tree/master/addons/monitoring-standalone for the latest version when you use it.
# kubectl create -f https://raw.githubusercontent.com/kubernetes/kops/master/addons/monitoring-standalone/v1.7.0.yaml
deployment "heapster" created
service "heapster" created
serviceaccount "heapster" created
clusterrolebinding "heapster" created
rolebinding "heapster-binding" created

Check if the heapster pods are up and running:

# kubectl get pods --all-namespaces | grep heapster
kube-system heapster-56d577b559-dnjvn 2/2 Running 0 26m
kube-system heapster-v1.4.3-6947497b4-jrczl 3/3 Running 0 5d
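
Once the Heapster Pods are ready, you can optionally confirm that metrics are being collected with kubectl top, which is backed by Heapster in this setup (the exact numbers will differ on your cluster):

// optional sanity check: heapster should now be serving node and pod metrics
# kubectl top nodes
# kubectl top pods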

Assuming we continue right after the Getting Ready section, we will have two my-nginx Pods running in our cluster:

# kubectl get pods
NAME READY STATUS RESTARTS AGE
my-nginx-6484b5fc4c-9v7dc 1/1 Running 0 40m
my-nginx-6484b5fc4c-krd7p 1/1 Running 0 40m
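
One point worth stressing: the HPA calculates CPU utilization as a percentage of each Pod's CPU request, so the my-nginx Deployment must set resources.requests.cpu. A minimal sketch of such a Deployment is shown below; the nginx image and the 100m request are illustrative, not necessarily what the Getting Ready section used:

// hypothetical my-nginx Deployment; the CPU request is what the
// HPA's targetCPUUtilizationPercentage is measured against
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-nginx
  template:
    metadata:
      labels:
        app: my-nginx
    spec:
      containers:
      - name: my-nginx
        image: nginx
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 100m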

Then, we can use the kubectl autoscale command to deploy an HPA:

# kubectl autoscale deployment my-nginx --cpu-percent=50 --min=2 --max=5 
deployment "my-nginx" autoscaled
The equivalent object can also be described in a configuration file:

# cat 3-1-2_hpa.yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: my-nginx
spec:
  scaleTargetRef:
    kind: Deployment
    name: my-nginx
  minReplicas: 2
  maxReplicas: 5
  targetCPUUtilizationPercentage: 50
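
If you prefer the file-based approach over kubectl autoscale, the same HPA can be created from the file; since both share the name my-nginx, delete the one created by kubectl autoscale first:

// create the HPA from the configuration file instead of kubectl autoscale
# kubectl delete hpa my-nginx
# kubectl create -f 3-1-2_hpa.yaml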

To check if it's running as expected:

// check horizontal pod autoscaler (HPA)
# kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
my-nginx Deployment/my-nginx <unknown> / 50% 2 5 0 3s

We find the target shows as unknown and the replicas are 0. Why is this? The HPA runs as a control loop, at a default interval of 30 seconds, so there might be a delay before it reflects the real metrics.

The default sync period of an HPA can be altered by changing the following parameter in the controller manager: --horizontal-pod-autoscaler-sync-period.
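
As an illustration only, the flag is passed to kube-controller-manager at startup; the 15-second value below is an arbitrary example, not a recommendation:

// example: shorten the HPA sync period to 15 seconds
kube-controller-manager --horizontal-pod-autoscaler-sync-period=15s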

After waiting a couple of seconds, we will find that the current metrics are there now. The number shown in the TARGETS column represents (current / target). It means the load is currently 0%, and the scaling target is 50%:

# kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
my-nginx Deployment/my-nginx 0% / 50% 2 5 2 48m

// check the details of an HPA
# kubectl describe hpa my-nginx
Name: my-nginx
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Mon, 15 Jan 2018 22:48:28 -0500
Reference: Deployment/my-nginx
Metrics: ( current / target )
resource cpu on pods (as a percentage of request): 0% (0) / 50%
Min replicas: 2
Max replicas: 5

To test whether the HPA can scale the Pods properly, we'll manually generate some load on the my-nginx service:

// generate the load
# kubectl run -it --rm --restart=Never <pod_name> --image=busybox -- sh -c "while true; do wget -O - -q http://my-nginx; done"

In the preceding command, we ran a busybox image, which allows us to run a simple command in it. We used the -c parameter to specify the default command, an infinite loop that queries the my-nginx service.
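
For instance, the first load generator could be started with an arbitrary Pod name such as load-generator-1 (the name is purely illustrative):

// load-generator-1 is an arbitrary name; pick a different one for each extra generator
# kubectl run -it --rm --restart=Never load-generator-1 --image=busybox -- sh -c "while true; do wget -O - -q http://my-nginx; done"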

After about one minute, you can see that the current value is changing:

// check current value - it's 43% now, not yet exceeding the scaling threshold
# kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
my-nginx Deployment/my-nginx 43% / 50% 2 5 2 56m

With the same command, we can repeatedly run more load generators with different Pod names. Finally, we see that the condition has been met: the Deployment scales up to 3 replicas, and then up to 4 replicas:

# kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
my-nginx Deployment/my-nginx 73% / 50% 2 5 3 1h

# kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
my-nginx Deployment/my-nginx 87% / 50% 2 5 4 15m

Keep observing it and delete some of the busybox Pods we deployed. The HPA will eventually cool down and scale back down without any manual intervention:

# kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
my-nginx Deployment/my-nginx 40% / 50% 2 5 2 27m

We can see that the HPA has just scaled our Pods from 4 down to 2.
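
Once you're done experimenting, the autoscaler can be removed like any other resource; deleting it does not affect the Deployment itself:

// clean up the HPA when you no longer need it
# kubectl delete hpa my-nginx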
