Immutable Infrastructure

When we had bare metal servers, it took a bit of time to provision them. Even today, if you want to get new hardware, it can take days to get it connected and running. Needless to say, you would want to keep such servers running as long as possible, given the cost of replacing one or adding a new one. Then, as automation is a must, a set of configuration management tools appeared, so that these bare metal servers no longer had to be configured by hand.

Even with these tools, though, servers are prone to configuration drift: they can diverge a lot from one another, and people still log in via SSH and make changes that are not captured in infrastructure code.

Don't get me wrong: configuration management is still a must. But the context changed once virtualization was widely adopted in the form of Cloud providers. The time required to create a server was cut down to a few minutes, instead of hours or days. More importantly, the time required to recreate a server is also really low compared with the bare metal world. So why bother keeping the existing machine updated if you could just destroy it and create a new one from scratch? That's the basis of so-called Immutable Infrastructure.

What does this give operators?

  • Little to no configuration drift: Your server is created once and never updated, so a look at the base image tells you what state it is (most likely) in.
  • Predictable and simple updates: Each change to configuration is captured and versioned in source control. It is then tested with tools such as InSpec and Test Kitchen. Only then is it rolled out to environments, one at a time.

Immutable Infrastructure goes hand in hand with functional programming principles: functional programming languages provide immutable data structures, just as your servers are immutable once they are created.

Netflix is perhaps the best-known adopter of this approach. As one of the biggest AWS EC2 users, they run all of their workloads in the Cloud on virtual servers. Their processes are well covered in multiple posts on the Netflix technology blog, for example AMI Creation with Aminator (http://techblog.netflix.com/2013/03/ami-creation-with-aminator.html). They go really far with creating AMIs: there are multiple layers of them, each subsequent layer being baked from the AMI of the previous one. In the following diagram, taken from the Netflix blog, you can see what an application server AMI consists of, for example:

[Diagram from the Netflix blog: what an application server AMI consists of]

What becomes clear from reading the Netflix story is that the transition to Immutable Infrastructure and baking a ton of images are not that straightforward. More than anything, Immutable Infrastructure requires a new set of tools and techniques to work well. Baking new images has to be fast, reliable, and automated, and it should be part of a Continuous Integration and Continuous Delivery pipeline. The process of rolling out new images is also different from the traditional configuration management approach, where you just wait a few minutes for chef-client or the Puppet agent to rerun and apply the changes. In fact, an image becomes a new type of software package you develop, and it should be treated accordingly.

Another company that pushes the whole "throw away the complete server and create a new one" approach hard is, don't be too surprised, HashiCorp. As already mentioned, Terraform works with replaceable servers in mind. The way it versions the state file works best with immutable servers. Just think about it for a moment: if we taint the provisioner as we did earlier, what kind of change will be recorded in a version control system? You might see the change to the Puppet manifest, of course, but what if the manifests and modules come from a different location, separate from the Terraform repository? Yes, you would see that null_resource was recreated, but that's about it. What was the reason behind the recreation? What's the new state of your infrastructure?

It's a whole different story if you replace the AMI ID. Now you can clearly see that your machines were upgraded from AMI A to AMI B. You already know what changed between these two versions. You still have the full overview of the state and progress of your infrastructure. And look at the aws_ami data source, built into Terraform - it is perhaps the most (if not the only) robust and featureful data source that Terraform has.
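To make the AMI replacement idea more concrete, here is a minimal sketch of how the aws_ami data source might feed an instance definition. The resource names, the `my-app-*` name pattern, and the instance type are illustrative assumptions rather than values from this book's examples, and the syntax assumes Terraform 0.12 or later (older versions would write the reference as `"${data.aws_ami.app.id}"`).

```hcl
# Hypothetical sketch: names, the filter pattern, and the instance type
# are assumptions made for illustration only.

# Look up the newest image owned by this account whose name matches a pattern.
data "aws_ami" "app" {
  most_recent = true
  owners      = ["self"]

  filter {
    name   = "name"
    values = ["my-app-*"]
  }
}

# The server is never updated in place: when the AMI ID changes,
# Terraform plans to destroy this instance and create a replacement.
resource "aws_instance" "app" {
  ami           = data.aws_ami.app.id
  instance_type = "t2.micro"
}
```

With a setup along these lines, baking a new image that matches the filter should show up in the next plan as a change of the ami attribute from the old ID to the new one, which is exactly the "AMI A to AMI B" upgrade described above.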

Of course, Immutable Infrastructure is just one way to look at infrastructure management. It's not the only way to do it, but it is certainly a viable alternative to the traditional approach. Lots of hugely successful technology companies are very happy without this approach. Just look at Stack Overflow; it has a handful of bare metal servers handling all the production traffic. No VMs, and no constant server replacement.

Note

Stack Overflow posts a full description of the state of their infrastructure every now and then. The latest state is documented in an article named Stack Overflow: The Architecture - 2016 Edition. Refer to http://nickcraver.com/blog/2016/02/17/stack-overflow-the-architecture-2016-edition/.

There are trade-offs to doing Immutable Infrastructure, such as the added complexity of the whole toolset that your operations team has to maintain. It can also be much, much slower than just using configuration management. Baking an image is slow. Replacing it can also be slow. Certainly, it is really fast if you jump on the containers bandwagon, but this only means that you have to introduce another half a dozen new technologies to your organization.

None of this changes the fact that Terraform is a tool built with immutability in mind. And it's not the only tool: the HashiCorp stack has another utility that can be combined with Terraform in a powerful Immutable Infrastructure combo. This tool is named Packer, and you have to learn a bit of it if you want to master Terraform.
