Planning for the worst – train to rebuild working systems

It's one thing to get a full infrastructure finally managed by Chef, block by block, week after week, modification after modification, keeping the Chef run always smooth and working. However, it's quite another to be able to rebootstrap a working system from scratch. What if the current setup works perfectly well only because a script or a binary left over from last year is quietly doing the thing that makes it work? What if the application servers get corrupted tonight? Will we be able to rebuild them from scratch? If our IaaS cloud provider crashes tomorrow, how quickly will we be able to rebuild the systems somewhere else (provided the backups are working, but that's another story)?

Our systems are now automated as much as possible, hopefully 100 percent. It's important to know whether we'd be able to fully rebootstrap these systems in case of a disaster and, if so, how long it would take. You may be surprised when you collect some data and discover that many systems can be recovered in minutes. Compare this with the time it might take to find outdated documentation, apply untested manual processes, and finally do whatever it takes to get something up and running under the pressure of an emergency. We'll all sleep better at night and enjoy our weekends if we know that all the system profiles are being continuously and successfully rebootstrapped. In fact, why not have the CI system do this every night, so that every morning we know whether the previous day's changes have broken something? That way, we as a team always know that we're ready to redeploy a system if required.

Getting ready

To step through this recipe, you will need:

  • A working Chef DK installation on the workstation
  • A working Vagrant installation on the workstation
  • The Chef code (optionally) from Chapter 6, Fundamentals of Managing Servers with Chef and Puppet, Chapter 7, Testing and Writing Better Infrastructure Code with Chef and Puppet, or any custom Chef code

How to do it…

There is no single way to achieve our goal. We've already covered Test Kitchen, and this might be a good solution, especially if we have written extensive tests. Integrate this in the company's Continuous Integration (CI) system and this will do the job.

A simpler and quicker solution can also be to just launch Vagrant boxes with the right Chef-provisioning profiles for each use case: docker, webserver, database server, or full deployment.

Note

Refer to the Vagrant chapter of this book for more information about the Vagrant tool!

Our production servers are configured by applying some Chef code, and currently, it does this job pretty well. Are we able to easily rebootstrap a similar CentOS 7.2 server from scratch, to the point that it is installed the same way without any Chef or system error? Let's find out by adding a Vagrantfile at the root of the infrastructure repository, using the previous project code for deploying Docker (the idea is the same for any kind of Chef repo). The minimum we can do is boot a fresh CentOS 7.2:

Vagrant.configure("2") do |config|
  config.vm.box = "bento/centos-7.2"
end

We'd like to automatically install Chef on our temporary node, so let's use the vagrant-omnibus plugin (remember, installing it is easy: vagrant plugin install vagrant-omnibus). Here's the code to do this:

  config.omnibus.chef_version = :latest

Let's configure the Vagrant provisioning system to use Chef Zero in order to simulate a Chef server. We could also use a real Chef server directly, which can be handy if we have one behind the firewall. We have to specify where everything is placed (cookbooks, environments, roles, and so on), with the added subtlety of a nodes folder that will be left empty in our case. Our virtual machine will run in the production environment and apply the docker role:

  config.vm.provision "chef_zero" do |chef|
    chef.cookbooks_path = "cookbooks"
    chef.environments_path = "environments"
    chef.roles_path = "roles"
    chef.nodes_path = "nodes"
    chef.environment = "production"
    chef.add_role "docker"
  end

We're almost done! We need to enable the Vagrant Berkshelf plugin and tell it where to look for the Berksfile (installing the Berkshelf plugin is easy: vagrant plugin install vagrant-berkshelf). Here's the code to do this:

  config.berkshelf.berksfile_path = "cookbooks/platform/Berksfile"
  config.berkshelf.enabled = true
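Since this Vagrantfile now depends on two plugins, a CI agent that lacks one of them will fail in a confusing way. One way to fail early is a small check at the top of the Vagrantfile. This is only a sketch: `missing_plugins` is a hypothetical helper name, and the `defined?(Vagrant)` guard is there solely so that plain Ruby can load the file outside Vagrant.

```ruby
# Return the names from `required` for which `checker` answers false.
# `checker` is any callable taking a plugin name (hypothetical helper,
# not part of Vagrant).
def missing_plugins(required, checker)
  required.reject { |name| checker.call(name) }
end

# Under Vagrant the constant is always defined; the guard only allows
# this file to be loaded by plain Ruby, for example for a syntax check.
if defined?(Vagrant)
  missing = missing_plugins(%w[vagrant-omnibus vagrant-berkshelf],
                            Vagrant.method(:has_plugin?))
  unless missing.empty?
    raise "Missing Vagrant plugins: #{missing.join(', ')} " \
          '(install each with: vagrant plugin install NAME)'
  end
end
```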

Starting Vagrant at this point will just deploy everything from scratch:

$ vagrant up 
[...]
# Chef Client finished, 17/45 resources updated in 03 minutes 30 seconds

If the run succeeds, meaning the code from the Docker role is applied, we're safe. Let's destroy the VM:

$ vagrant destroy -f

Including these Vagrant commands in our CI system verifies that this particular role still applies cleanly in this particular environment and on this particular system. It also tells us that going from nothing to a working state is potentially a matter of three minutes and 30 seconds.
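To follow that recovery time from one CI run to the next, the summary line can be turned into a number. The following is a minimal Ruby sketch, assuming the summary always has the shape shown in the output above (other Chef versions may word it differently). `SUMMARY_PATTERN` and `parse_chef_summary` are hypothetical names, not part of Chef or Vagrant.

```ruby
# Matches the summary line in the format shown in this recipe's output.
# The "NN minutes" part is optional so that shorter runs still match
# (assumption: such runs print only "NN seconds").
SUMMARY_PATTERN = %r{Chef Client finished, (\d+)/(\d+) resources updated in (?:(\d+) minutes )?(\d+) seconds}

# Returns a hash with the resource counts and the run time in seconds,
# or nil when the line is not a summary line.
def parse_chef_summary(line)
  match = SUMMARY_PATTERN.match(line)
  return nil unless match

  {
    updated: match[1].to_i,
    total: match[2].to_i,
    # nil.to_i is 0, so a missing minutes part contributes nothing
    seconds: match[3].to_i * 60 + match[4].to_i
  }
end

p parse_chef_summary('# Chef Client finished, 17/45 resources updated in 03 minutes 30 seconds')
# prints {:updated=>17, :total=>45, :seconds=>210} (newer Rubies print {updated: 17, total: 45, seconds: 210})
```

Storing the `seconds` value per role and per night makes a slowly degrading rebuild time visible long before it matters.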

Multi-machine recovery

Let's move to a more complicated setup. Vagrant supports multi-machine setups, letting us define profiles for each one of them. In a previous example of this chapter, we deployed a WordPress installation with a database configured and the Apache web server configured as well, all with encrypted data bags and templates. We'll implement the same idea, except that Vagrantfile will include multiple machine profiles: one to start a virtual machine only with the webserver role, another to deploy only the database part, and the third one to launch everything together, including the web application. So we'll make sure all the parts of the final product can be redeployed from scratch (which is the main point).

All VM definitions will live inside the main Vagrant configuration:

Vagrant.configure('2') do |config|
  config.vm.define 'whatever_vm', autostart: false do |node|
    [...]
  end
end

Note

We suggest disabling the automatic start of VMs so we don't launch dozens of VMs by mistake.

To make sure our code can bootstrap only the webserver role from scratch, we define a machine that sets the paths for everything, including the specific Berksfile for the job:

  config.vm.define 'webserver', autostart: false do |ws|
    ws.vm.box = 'bento/centos-7.2'

    ws.vm.provision :chef_zero do |chef|
      chef.cookbooks_path = 'cookbooks'
      chef.environments_path = 'environments'
      chef.roles_path = 'roles'
      chef.nodes_path = 'nodes'
      chef.environment = 'production'
      chef.add_role 'webserver'
    end

    ws.berkshelf.berksfile_path = 'cookbooks/apache/Berksfile'
    ws.berkshelf.enabled = true
  end

To launch only this box in order to make sure the webserver role can be deployed from scratch, use the following command:

$ vagrant up webserver

To make sure our code is capable of bootstrapping only the database part of this platform from scratch, just execute the mysite::mysql recipe in a similar context:

  config.vm.define 'db', autostart: false do |db|
    db.vm.box = 'bento/centos-7.2'

    db.vm.provision :chef_zero do |chef|
      chef.cookbooks_path = 'cookbooks'
      chef.environments_path = 'environments'
      chef.roles_path = 'roles'
      chef.nodes_path = 'nodes'
      chef.environment = 'production'
      chef.add_recipe 'mysite::mysql'
    end

    db.berkshelf.berksfile_path = 'cookbooks/mysite/Berksfile'
    db.berkshelf.enabled = true
  end

To launch only this box in order to make sure the database recipe can be deployed from scratch, use the following command:

$ vagrant up db
[...]
Chef Client finished, 29/43 resources updated in 01 minutes 28 seconds

To make sure our code can bootstrap the whole platform from scratch, we simply execute the whole mysite::default recipe, with one more step. One of the included recipes uses an encrypted data bag. It's stored encrypted on the Chef server, but locally, our ./data_bags/ directory currently includes only the unencrypted JSON versions. We have to make sure another folder hosts the encrypted versions (maybe you already have one, for example to store them on GitHub). Otherwise, export the encrypted version from the Chef server into a new directory, in JSON format (using -Fj):

$ mkdir -p data_bags_encrypted/aws
$ knife data bag show aws us-east-1 -Fj > data_bags_encrypted/aws/us-east-1.json
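Because this directory is meant to be committed, it is worth checking that no plain-text item slips into it. Below is a rough Ruby sketch of such a check. It assumes the encrypted item format in which every key other than id maps to a hash containing an encrypted_data field; items written in the oldest format, which stores a bare string, would be reported as unencrypted. `encrypted_item?` and `unencrypted_files` are hypothetical helper names.

```ruby
require 'json'

# True when every value except "id" looks like an encrypted field,
# that is, a hash carrying an "encrypted_data" key (assumed format).
def encrypted_item?(item)
  values = item.reject { |key, _| key == 'id' }.values
  !values.empty? && values.all? { |v| v.is_a?(Hash) && v.key?('encrypted_data') }
end

# Lists the JSON files under `dir` (at any depth) that do not look encrypted.
def unencrypted_files(dir)
  Dir.glob(File.join(dir, '**', '*.json')).reject do |path|
    encrypted_item?(JSON.parse(File.read(path)))
  end
end

suspicious = unencrypted_files('data_bags_encrypted')
puts "Not encrypted: #{suspicious.join(', ')}" unless suspicious.empty?
```

Run from the repository root, this prints nothing when every item looks encrypted.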

Now we can define the full VM like the others with the modified data bag path for the encrypted version:

  config.vm.define 'mysite', autostart: false do |mysite|
    mysite.vm.box = 'bento/centos-7.2'

    mysite.vm.provision :chef_zero do |chef|
      chef.cookbooks_path = 'cookbooks'
      chef.environments_path = 'environments'
      chef.data_bags_path = 'data_bags_encrypted'
      chef.roles_path = 'roles'
      chef.nodes_path = 'nodes'
      chef.environment = 'production'
      chef.add_recipe 'mysite::default'
    end

    mysite.berkshelf.berksfile_path = 'cookbooks/mysite/Berksfile'
    mysite.berkshelf.enabled = true
  end

To launch only this box in order to make sure the whole recipe is deployed from scratch, use the following command:

$ vagrant up mysite
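The three definitions differ only in their run list, their Berksfile, and the data bag path. As a possible refactoring, the differences can be kept in one table and the shared settings written once. This is a sketch, not the layout used in this recipe: `machine_profiles` is a hypothetical helper name, and the `defined?(Vagrant)` guard exists only so that plain Ruby can load the file.

```ruby
# Per-machine differences only; every other setting is shared.
# machine_profiles is an illustrative name, not a Vagrant API.
def machine_profiles
  {
    'webserver' => { role: 'webserver', berksfile: 'cookbooks/apache/Berksfile' },
    'db' => { recipe: 'mysite::mysql', berksfile: 'cookbooks/mysite/Berksfile' },
    'mysite' => { recipe: 'mysite::default', berksfile: 'cookbooks/mysite/Berksfile',
                  data_bags_path: 'data_bags_encrypted' }
  }
end

# Under Vagrant the constant is always defined; the guard only allows
# this file to be loaded by plain Ruby, for example for a syntax check.
if defined?(Vagrant)
  Vagrant.configure('2') do |config|
    machine_profiles.each do |name, profile|
      config.vm.define name, autostart: false do |node|
        node.vm.box = 'bento/centos-7.2'

        node.vm.provision :chef_zero do |chef|
          chef.cookbooks_path = 'cookbooks'
          chef.environments_path = 'environments'
          chef.roles_path = 'roles'
          chef.nodes_path = 'nodes'
          chef.environment = 'production'
          # Only the full deployment needs the encrypted data bags
          chef.data_bags_path = profile[:data_bags_path] if profile[:data_bags_path]
          chef.add_role profile[:role] if profile[:role]
          chef.add_recipe profile[:recipe] if profile[:recipe]
        end

        node.berkshelf.berksfile_path = profile[:berksfile]
        node.berkshelf.enabled = true
      end
    end
  end
end
```

Adding a new profile to rebuild then means adding one line to the table, and the machine names remain usable with vagrant up as before.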

Put these commands (with their destroy counterparts) in the CI, or whatever system you prefer, and run them at a regular interval, such as daily or weekly, for each and every automated part of the infrastructure. With this in place, you'll know from recent evidence that you can redeploy the system when disaster strikes.
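A scheduled job of this kind could look like the following Ruby sketch. It assumes vagrant is on the PATH of the CI agent and that the machine names match those defined in the Vagrantfile. `rebuild`, `rebuild_all`, and `DEFAULT_RUNNER` are hypothetical names; the runner is passed in so the logic can be exercised without starting any VM.

```ruby
# Runs a command and returns true only on a zero exit status.
DEFAULT_RUNNER = ->(*cmd) { system(*cmd) }

# Boots and provisions one machine from scratch, then destroys it.
# Returns the outcome and the elapsed time of the "vagrant up" step.
def rebuild(machine, runner: DEFAULT_RUNNER)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  succeeded = runner.call('vagrant', 'up', machine) == true
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  { machine: machine, ok: succeeded, seconds: elapsed.round }
ensure
  # Always clean up, even when provisioning failed, so the next run
  # starts from nothing.
  runner.call('vagrant', 'destroy', '-f', machine)
end

# Rebuilds each machine in turn and returns one result per machine.
def rebuild_all(machines, runner: DEFAULT_RUNNER)
  machines.map { |name| rebuild(name, runner: runner) }
end
```

A nightly job would call `rebuild_all(%w[webserver db mysite])`, print the results, and exit with a non-zero status if any entry has `ok` set to false, so the CI system marks the build as failed.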

There's more…

With Puppet, all the examples we used were based on Vagrant, so rebuilding nodes from scratch is easy. But in the real world, you probably won't deploy and maintain a production system running from Vagrant on your workstation.

However, these examples show that it is possible to simulate a complete infrastructure using a simple vagrant up command, and therefore, it is easy to put it into any CI system to ensure you will be able to rebuild your production system easily.
