Canary deployments

Kubernetes supports blue/green deployments, which are very useful when a rollback is required. Canary deployments refer to incremental rollouts in which the new version of an application is deployed gradually, receiving only a small portion of the traffic, or in which only a subset of live users is connected to the new version.

The previous section on identity-based routing is an example of routing only a subset of users to the new version. It can be argued that Kubernetes already supports canary deployments, so why is there a need for Istio's canary deployment?

Let's understand this with an example.

Let's assume that we have only two versions of the reviews service, v1 and v2, and that the reviews service endpoints point to the pods of both versions. Without an Istio virtual service in place, Kubernetes will round-robin the traffic, sending 50% to each of the two pods. If v2 is the new version, it receives 50% of the traffic as soon as it is deployed.

If we wanted to divert 90% of the traffic to the old version, v1, and allow only 10% of the traffic to v2, we could have scaled v1 to nine replicas and kept v2 at a single replica. This would have caused Kubernetes to direct roughly 90% of the traffic to v1 and 10% to v2.
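For comparison, this Kubernetes-only approach would look something like the following sketch. It assumes deployments named reviews-v1 and reviews-v2 (the standard Bookinfo names) and is shown only for illustration; we do not actually scale the deployments in this exercise:

$ kubectl -n istio-lab scale deployment reviews-v1 --replicas=9
$ kubectl -n istio-lab scale deployment reviews-v2 --replicas=1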

Istio goes much further than what Kubernetes provides. With Istio, traffic routing and replica deployment are two independent concerns. For example, with a single replica each of v1 and v2, it is possible to divert 90% of the traffic to v1 and 10% to v2 without scaling either version. Likewise, we could run four replicas of v1 and route only 20% of the traffic to it, while routing 80% of the traffic to a canary v2 version with just one replica.

Let's see this through an implementation example:

  1. Consider that reviews:v1 is the production version and that we are deploying reviews:v2, which hasn't been fully tested. We want to route only 10% of the traffic to it, without increasing or decreasing the number of replicas:
# Script: 06-canary-deployment-weight-based-routing.yaml

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 90
    - destination:
        host: reviews
        subset: v2
      weight: 10

In the routing rules of the reviews virtual service, we have assigned a weight of 90% to the v1 subset and 10% to the v2 subset.
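Note that the v1 and v2 subsets referenced here are assumed to have been defined already through a destination rule, as was done in the earlier routing sections. For reference, such a destination rule would look similar to the following sketch, which uses the standard Bookinfo version labels:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2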

  2. Modify the reviews virtual service with weight-based routing:
$ kubectl -n istio-lab apply -f 06-canary-deployment-weight-based-routing.yaml
virtualservice.networking.istio.io/reviews configured
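If you want to confirm the rule, you can inspect the applied virtual service:

$ kubectl -n istio-lab get virtualservice reviews -o yaml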

Go back to the browser and hit refresh multiple times. You will notice that, most of the time, the page shows no stars (reviews:v1) and that, occasionally, it shows black stars (reviews:v2). If you look at the HTML source when the black stars are shown, you will notice that it contains two HTML comments with the text full stars, which appear whenever the traffic is sent to v2 of reviews.

  3. Run the curl command against the productpage 1,000 times and count the "full stars" HTML comments to estimate the percentage of traffic that is routed to each of the two versions of the reviews service. This will take some time to complete:
$ echo $INGRESS_HOST
192.168.142.249

$ time curl -s http://$INGRESS_HOST/productpage?[1-1000] | grep -c "full stars"

204

real 0m42.698s
user 0m0.032s
sys 0m0.343s
Note: Make sure that $INGRESS_HOST is populated with the load balancer IP address of your environment. Run the following command to find the external IP address: kubectl -n istio-system get svc istio-ingressgateway.

If we divide 204 by 2 (since each response from reviews:v2 contains the string "full stars" twice), we see that roughly 10% of the traffic (102/1,000 ≈ 10%) was sent to reviews:v2 through weight-based routing using the canary deployment capabilities of Istio.
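As a quick sanity check, the observed percentage can be computed in the shell (a minimal sketch, assuming bc is installed):

$ echo "scale=1; 204 * 100 / 2 / 1000" | bc
10.2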

Notice that, without scaling any pods, we were able to divert 10% of the traffic to the canary release (reviews:v2). This is possible because Pilot pushes the routing configuration to the Envoy sidecar proxies and because load balancing is done at Layer 7.
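If you want to see the weighted routes that Pilot has pushed to a sidecar, an istioctl proxy-config query similar to the following sketch can be used. The pod name shown is only a placeholder; substitute one of your own pods from the istio-lab namespace:

$ istioctl proxy-config routes productpage-v1-xxxxx.istio-lab --name 9080 -o json | grep -B2 -A2 weight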

Let's assume that you are now satisfied with the canary deployment and want to shut down v1 and make v2 part of the production release.

  4. Modify the reviews virtual service and apply the rule:
# Script : 07-move-canary-to-production.yaml 

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v2

In the preceding code, we removed the route to the v1 subset and removed the weight from the v2 subset so that 100% of the traffic is routed to v2, making it the new production release.

  5. Apply the new rule by modifying the reviews virtual service:
$ kubectl -n istio-lab apply -f 07-move-canary-to-production.yaml
virtualservice.networking.istio.io/reviews configured
  6. Now, repeat the same test:
$ curl -s http://$INGRESS_HOST/productpage?[1-1000] | grep -c "full stars"
2000

Since each HTML page contains two occurrences of "full stars", we get a count of 2,000 from the 1,000 requests we sent using the preceding curl command. This shows that our canary deployment is now the new production release. If necessary, we can take down v1 of reviews.
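For example, once you are comfortable with v2, v1 could be scaled down to zero replicas (a sketch only; the reviews-v1 deployment name matches the standard Bookinfo manifests):

$ kubectl -n istio-lab scale deployment reviews-v1 --replicas=0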

It is important to note that the preceding capabilities are available without making changes to the application code, without taking an outage, and, more importantly, without having to change the number of replicas.

Now that we've learned about traffic shifting features, we will explore fault injection and timeout features.
