Scaling

The scale command works the same way as it did in our ReplicationController. To scale up, we simply use the deployment name and specify the new number of replicas, as shown here:

$ kubectl scale deployment node-js-deploy --replicas 3

If all goes well, we'll simply see a message about the deployment being scaled in the output of our Terminal window. We can check the number of running pods using the get pods command from earlier. In the latest versions of Kubernetes, you're also able to set up pod scaling for your cluster, which allows you to do horizontal autoscaling so you can scale up pods based on the CPU utilization of your cluster. You'll need to set a maximum and minimum number of pods in order to get this going.

Here's what that command would look like with this example:

$ kubectl autoscale deployment node-js-deploy --min=25 --max=30 --cpu-percent=75
deployment "node-js-deploy" autoscaled

Read more about horizontal pod scaling in this walkthrough: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/.

There's also a concept of proportional scaling, which allows you to run multiple version of your application at the same time. This implementation would be useful when incrementing a backward-compatible version of an API-based microservice, for example. When doing this type of deployment, you'll use .spec.strategy.rollingUpdate.maxUnavailable and .spec.strategy.rollingUpdate.maxSurge to limit the maximum number of pods that can be down during an update to the deployment, or the maximum number of pods that can be created that exceed the desired number of pods, respectively.

Table of Contents for Scaling

Create new playlist

Sign In

Sign Up

Table of Contents for
Scaling