Managed Instance Groups

One cloud virtual machine instance is fine, but we've seen at the very start of the book that the reasons for switching to the cloud can be summed up in two words: autohealing and autoscaling. One cloud VM instance is not going to provide autohealing or autoscaling, so we need a higher-level abstraction, the Managed Instance Group (MIG).

A Managed Instance Group is a pool of identical VM instances, all created from the same instance template and managed by the platform as a single unit. Understanding this one sentence is really important, so let's parse it carefully.

That, fundamentally, is what a Managed Instance Group is. Each element in an MIG is a GCE VM, just like any other GCE VM that you might have spun up. One VM instance is vulnerable; it can crash or be overwhelmed by a spike in client traffic. A group of VM instances, by contrast, is effectively a cluster and is much more robust.

All the VM instances in an MIG are cast from the same mold; that mold is called an instance template. How does an instance template come into existence? In pretty much the same two ways that an individual VM comes into existence:

  • You can create an instance template by specifying much the same properties that you would while creating an individual VM: the name, machine type, boot disk, and OS image
  • You can import an instance template from an external image or Docker container
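As a concrete sketch, an instance template is essentially a bundle of VM properties. The dictionary below is modeled loosely on the Compute Engine REST API's instance-template resource; the specific values (the template name, machine type, and source image) are illustrative assumptions, not prescriptions:

```python
# A minimal sketch of an instance template body, loosely modeled on the
# Compute Engine REST API. Every VM "cast from this mold" shares these
# properties. All concrete values here are illustrative.
instance_template = {
    "name": "web-template",              # hypothetical template name
    "properties": {
        "machineType": "e2-medium",      # machine type for every member VM
        "disks": [{
            "boot": True,
            "initializeParams": {
                # OS image the boot disk of each member is created from
                "sourceImage": "projects/debian-cloud/global/images/family/debian-12",
            },
        }],
        "networkInterfaces": [{"network": "global/networks/default"}],
    },
}

# Each MIG member inherits the same machine type from the template.
print(instance_template["properties"]["machineType"])
```

The point of the template is uniformity: because every member VM is created from the same bundle of properties, replacing a failed member with a fresh, identical one is a purely mechanical operation.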

The platform takes responsibility for ensuring that each member of the MIG is running and ready to accept client requests. This is done by associating a health check with the MIG. The health check is best thought of as a probe or polling program that keeps asking each member of the MIG whether it is healthy. This probe needs to be received and understood by the individual VM instances, which means that the protocol and port must be prespecified; the choices of protocol are HTTP(S), TCP, and SSL (TLS).

The health checker will probe each instance at a specified interval (named the check interval) and then wait. If no response is received within another specified interval (named the timeout), the health checker concludes that the probe has failed. If a specified number of consecutive probes all time out, the instance is marked unhealthy, and the MIG will spin up a new instance to replace the sick one. That number of failed probes is called the unhealthy threshold. In the meantime, the service will keep probing the unhealthy instances as well, hoping that they have come back online. Once a new VM instance comes online, the checker will look for a number of consecutive successful probes (named the healthy threshold) before deciding that it is safe to send traffic to the instance.
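The threshold logic above can be sketched as a tiny state machine. This class is an illustration of the bookkeeping the text describes, not the platform's actual implementation, and the default threshold values are illustrative:

```python
class HealthChecker:
    """Tracks one instance's health as described in the text:
    enough consecutive failed probes mark it unhealthy; enough
    consecutive successful probes mark it healthy again."""

    def __init__(self, unhealthy_threshold=3, healthy_threshold=2):
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy_threshold = healthy_threshold
        self.consecutive_failures = 0
        self.consecutive_successes = 0
        self.healthy = True

    def record_probe(self, responded_in_time: bool):
        if responded_in_time:
            self.consecutive_failures = 0
            self.consecutive_successes += 1
            if not self.healthy and self.consecutive_successes >= self.healthy_threshold:
                self.healthy = True   # safe to send traffic again
        else:
            self.consecutive_successes = 0
            self.consecutive_failures += 1
            if self.healthy and self.consecutive_failures >= self.unhealthy_threshold:
                self.healthy = False  # the MIG replaces this capacity

checker = HealthChecker()
for _ in range(3):
    checker.record_probe(False)   # three consecutive timeouts
print(checker.healthy)            # False: unhealthy threshold reached
checker.record_probe(True)
checker.record_probe(True)        # two consecutive successes
print(checker.healthy)            # True: healthy threshold reached
```

Note the asymmetry: a single successful probe resets the failure count, but an instance that has been marked unhealthy must pass the full healthy threshold of consecutive probes before traffic is sent to it again.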

Now, if you are paying close attention, you might object that this algorithm does not really sound like autohealing: the service does not actually restart the unhealthy instances; it merely continues polling them, hoping that some engineer has gone in and fixed them. So this, technically, is healing, but not quite autohealing.

If you thought this, well, you're wide awake, and absolutely right. The term autohealing refers specifically to an additional feature (currently in beta) in which the MIG recreates an instance once its health check fails. This automated recreation really is autohealing, although even the ordinary healing is quite important: it ensures that crashes of individual members of an MIG do not reduce the capacity of the MIG as a whole.

Autoscaling is another important additional feature of MIGs. You can specify a way for the MIG to measure how much load your VM instances are experiencing, and the number of elements in the MIG will go up or down in order to keep that load close to a threshold you specify.

So, for instance, you might specify a CPU utilization threshold, say 60%. The service will then measure the average CPU utilization across the MIG, and if that number exceeds 60%, new VM instances will be spun up from the instance template we mentioned earlier. Later, say traffic dies down and the average CPU utilization falls well below the target; the MIG will then scale down by getting rid of some VM instances. This scale-down is graceful; connection draining ensures that existing requests are serviced even as new ones are no longer accepted. Over time, those existing requests complete, and once they are all done, the instance can shut down and exit the MIG.

Autoscaling is fairly responsive: policies are checked, and updates to the state of the MIG are made, every minute or so.

A few more important points are worth keeping in mind: autoscaling policies can be based on CPU utilization, HTTP load-balancing requests per second, or Stackdriver metrics. The last bit about Stackdriver is subtle: remember that Stackdriver allows us to create custom metrics, so we can effectively specify an autoscaling policy based on anything we want to measure (we do need to instrument our code and define the custom metric in Stackdriver, though).

If you specify multiple policies, the MIG will go with the most liberal policy; that is, it will always tend to provision the largest number of VM instances that might be needed by your app.
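The "most liberal policy wins" rule amounts to taking the maximum of the instance counts that each policy would recommend on its own. The sketch below assumes each policy has already been evaluated into a recommended size; the metric names, including the custom Stackdriver metric, are hypothetical:

```python
def combined_size(per_policy_recommendations: dict) -> int:
    """With multiple autoscaling policies in force, the MIG follows
    the most liberal one: the largest recommended size wins."""
    return max(per_policy_recommendations.values())

recommendations = {
    "cpu_utilization": 5,                     # CPU policy wants 5 VMs
    "http_requests_per_second": 8,            # load-balancing policy wants 8
    "custom.googleapis.com/queue_depth": 3,   # hypothetical custom metric
}
print(combined_size(recommendations))  # 8: the largest recommendation
```

The design rationale is safety: under-provisioning risks dropped requests, while over-provisioning merely costs a little more until the next evaluation cycle shrinks the group again.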

The advantage of autoscaling is obvious: you can scale up when traffic surges and scale down (saving cost) when traffic falls. Autoscaling is a pretty important feature for any compute service, and indeed, GCE VMs are not alone in offering this functionality. Autoscaling comes with the territory in App Engine; you don't need to do anything to get it going. App Engine Flex is a bit slower to scale up and down than App Engine Standard, but they both provide autoscaling. On GKE, you can use an abstraction called the Horizontal Pod Autoscaler (HPA) to get similar functionality.

Hopefully, the manner in which the platform manages an instance group is now clear: there is a lot of behind-the-scenes processing going on to make sure that the actual and desired states of the MIG stay in sync. One last bit: you should be aware that MIGs can be hooked up, pretty much automatically, to load balancers as backend server clusters. This use of MIGs with load balancers is a very important cloud use case.
