Provisioners

Provisioners in Terraform are configuration blocks available for several resources that allow you to perform actions after the resource is created. They are mostly useful for servers, such as EC2 instances. Provisioners can be used to bootstrap an instance, connect it to a cluster, or run tests, as we did manually in the previous section. There are four types of provisioners:

  • local-exec
  • remote-exec
  • file
  • chef

Each of them has its own applications and is useful for solving a particular kind of problem. Let's take local-exec first and see how it can help with building inventory files for Ansible.

Provisioning with local-exec and Ansible

Ansible is one of the newest and hottest open source configuration management tools. It has become increasingly popular due to its ease of use. One of the main differences from Chef and Puppet is that it does not require any agent to be installed on the machines it configures. The only requirement is that the machine has Python preinstalled, which is most often the case anyway.

Ansible is executed via SSH. You can run ad-hoc commands to execute trivial scripts on servers, as well as apply Ansible playbooks - definitions of server configuration written in YAML format, analogous to Chef cookbooks and Puppet modules. Ansible needs an inventory file to be able to run a playbook or ad-hoc commands. An inventory is just a text file with a hostname on each line; hosts can optionally be grouped by putting a group name in square brackets on the line preceding the hostnames that belong to that group. Let's use Terraform to add an entry to the inventory file after the new EC2 instance is provisioned.

First, make sure you have Python and pip installed. Depending on the operating system, they could be either already installed or available for installation via the system package manager.

Once pip is available, installing Ansible is a single command away:

$> pip install ansible
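
You can verify that the installation succeeded by checking the version:

$> ansible --version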

Now let's add a provisioner to ./modules/application/application.tf:

resource "aws_instance" "app-server" { 
  ami = "${data.aws_ami.app-ami.id}" 
  instance_type = "${lookup(var.instance_type, var.environment)}" 
  subnet_id = "${var.subnet_id}" 
  vpc_security_group_ids = ["${concat(var.extra_sgs, aws_security_group.allow_http.*.id)}"] 
  user_data = "${data.template_file.user_data.rendered}" 
  key_name = "${var.keypair}" 
  provisioner "local-exec" { 
    command = "echo ${self.public_ip} >> inventory" 
  } 
  tags { 
    Name = "${var.name}" 
  } 
} 

Inside provisioners (and only inside provisioners), we can use the special self keyword to access attributes of the resource being provisioned. The provisioner's command is executed relative to the folder you are running Terraform from.

Apply the template. Notice that provisioners run only once, after the resource is created. Updates will not re-trigger provisioning, so if your stack was created previously, destroy it and create it again, or use your knowledge of the terraform taint command to recreate only the EC2 instance.
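
For example, tainting just the application server (using the module name from this book's examples) and applying again could look like this:

$> terraform taint -module=mighty_trousers aws_instance.app-server
$> terraform apply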

After you have the public IP address in an inventory file, you can try running Ansible:

$> ansible all -i inventory -a "cat /etc/redhat-release" -u centos
35.156.10.103 | SUCCESS | rc=0 >>
CentOS Linux release 7.2.1511 (Core) 

It might take a few minutes for AWS to upload the public key to the instance. Don't be surprised if it doesn't work on the first attempt.

Clearly, this implementation doesn't scale very well. Sooner or later, we will have multiple machines with different roles: application servers, database servers, and so on. One option would be to create an inventory file beforehand with host groups predefined inside it, as follows:

[app-server]
[db-server]

Then, we could extend the local-exec provisioner to be a little bit smarter:

  provisioner "local-exec" { 
    command = "sed -i '/\[app-server\]/a ${self.public_ip}' inventory" 
  } 

With some sed magic, we will add the application server to the [app-server] host group. This will allow us to write more granular Ansible code. Note the double backslashes: we need to escape the brackets both for sed and for Terraform.
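
After an apply, the inventory file would then look something like this (with the actual public IP of the instance in place of the placeholder):

[app-server]
<public-ip-of-app-server>
[db-server]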

The process of creating and provisioning the complete infrastructure could look as follows:

  • Run the Terraform template to create servers and populate the inventory file
  • Run the Ansible playbook to configure all instances in all groups
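
In terms of commands, the two steps boil down to something like the following (site.yml is an assumed playbook name):

$> terraform apply
$> ansible-playbook -i inventory site.yml -u centos   # site.yml is a hypothetical playbook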

This approach has some flaws, though. For example, it doesn't handle deleted servers: if an instance is gone from the Terraform template, it will still exist in the previously generated inventory file, leading to failed Ansible runs. We also don't have any indicator of SSH being ready: if we stay with this approach, we basically have to guess whether the instance is ready to be SSHed into. Not good!

It would be nice if Terraform had built-in Ansible support, but that's not the case yet. However, there is a small utility named terraform-inventory that generates a dynamic Ansible inventory from the Terraform state file. It is available for download on GitHub at https://github.com/adammck/terraform-inventory. You need to use it after running Terraform, as follows:

$> ansible all -i ~/bin/terraform-inventory -a "cat /etc/redhat-release" -u centos

Note

Besides static text file inventories, Ansible has dynamic inventories, meaning that you can pass a script that generates an inventory on the fly. That's why the command mentioned previously works.

It's important to understand that Terraform is a tool that focuses on doing exactly one job, and quite often you will need to add extra tools that either support Terraform or extend its area of usage. Luckily, there are external programs that can make life with Terraform a little bit easier, like terraform-inventory, which we just saw. We will cover a few other tools a bit later.

The local-exec provisioner is a powerful way to trigger scripts from the machine that is running Terraform commands. Earlier, we used outputs to pass an IP address to Inspec. With local-exec, we could remove the manual step of running the inspec command: just put this command inside the provisioner and pass the IP by interpolating the aws_instance attribute. This sounds like a great exercise for you! After you are done with it, proceed further.
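
If you get stuck, such a provisioner could look roughly like the following (the profile path and the private key path are placeholders, not paths used earlier in this book):

  provisioner "local-exec" { 
    # ./tests and the key path below are placeholders 
    command = "inspec exec ./tests -t ssh://centos@${self.public_ip} -i ~/.ssh/my_private_key.pem" 
  } 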

Although there is no built-in Ansible support, there is Chef support that works out of the box. Let's take a quick look at it.

Provisioning with Chef

Chef is a much older, more mature configuration management solution. Unlike Ansible, it does require the installation of an agent, named chef-client, on each server. Also, unlike Ansible, there is a Chef server that each client pulls its configuration from. We will not install a complete Chef server because doing so could take up the rest of the chapter. If you know and use Chef, then keep reading. If you don't, skip to the next section.

There are two places where Chef APIs are used in Terraform:

  • A Chef provisioner
  • A Chef provider

A Chef provisioner allows you to specify all the details needed to connect to a Chef server, an initial set of attributes, and the run list. Once the instance is created, Terraform will SSH into it, install chef-client, and try to register it with the Chef server using the configuration you provided in your template:

 provisioner "chef"  { 
     run_list = ["cookbook::recipe"] 
     node_name = "app-server-1" 
     server_url = "https://chef.internal/organizations/my_company" 
     recreate_client = true 
     user_name = "packt" 
     user_key = "${file("packt.pem")}" 
  } 

The recreate_client option is important: without it, you can't re-register the server in case it has to be recreated. There are many more parameters you can configure for the Chef provisioner. For the full reference, you should consult the Terraform documentation.

The other part, the Chef provider, allows you to create various entities on a Chef server. You could create a node with the Chef provider as well, but this is not recommended because it doesn't actually install chef-client anywhere. You could use the provider to store and update all of your Chef roles as code:

resource "chef_role" "app-server" { 
  name = "app-server" 
  run_list = ["recipe[nginx]"] 
} 

There are two things to consider before using Terraform with Chef, though:

  • First, Chef itself already has the capability to store all of its resources as code, in a dedicated Chef repository. Normally, all roles, data bags, and so on are already stored in a version control system in plain JSON format, and using Terraform for this purpose has little to no benefit.
  • Second, if you use Chef heavily, then you might not need Terraform at all. Chef already has the chef-provisioning component, which solves exactly the same problem as Terraform: it allows you to define your infrastructure in a single template. It has a few extra benefits, like being platform-agnostic for base resources: servers, networks, and others. It has downsides as well: it is not as actively developed as Terraform (and looks more and more like a deprecated project), and the list of supported providers is not that big. But if you use AWS and Chef, then bringing Terraform into the picture might not be the best decision to make.

In order for Terraform to use the Chef provisioner, it has to have SSH access to the server. This SSH access doesn't have to be used only to set up Chef, though. That brings us to the remote-exec provisioner.

Provisioning with remote-exec and Puppet

Each provisioner that needs to connect to the instance has a connection block defined. This block holds either the SSH or the WinRM configuration, so that Terraform knows how to connect to the server and with which user and password (or key). You can execute any scripts on the target server over this connection with the help of the remote-exec provisioner.

There is a built-in Chef provisioner, and it's rather easy to use Ansible (because it doesn't need anything installed on the target system), but what about Puppet? It works very similarly to Chef, with the same server-client model, and it requires the Puppet agent to be installed. Let's do it with remote-exec.

Note

We could put its installation into the cloud-init script, but then we wouldn't be able to use it for the remote-exec demonstration.

First, prepare modules/application/application.tf for remote-exec: remove any local-exec and Chef provisioners you've added before.

Then, let's configure a connection block. The default username for SSHing into the CentOS 7 AMI is "centos", and we need to specify it in the template:

  provisioner "remote-exec" { 
    connection { 
      user = "centos" 
    } 
  } 

This should be enough to get going, but if you are using a non-default private key, then you need to specify it in the same block:

  provisioner "remote-exec" { 
    connection { 
      user = "centos" 
      private_key = "${file("/home/johndoe/.ssh/my_private_key.pem")}" 
    } 
  } 

Note

You could also use SSH-agent with the boolean agent parameter. In a team, you don't want to hardcode the path to your private key; this path is different for each of your colleagues. Using SSH-agent solves this problem.

Sometimes, your servers are not publicly available via SSH. Your database server, for example, is probably inside a private subnet, and in order to access it, you need to use a so-called bastion host. Terraform has you covered here as well, with bastion configuration options:

    connection { 
      user = "centos" 
      agent = true 
      bastion_host = "my_bastion.com" 
      bastion_user = "centos" 
      bastion_private_key = "${file("/home/johndoe/.ssh/bastion_key.pem")}" 
    } 

There is a similarly wide set of configuration options for WinRM connections, although a bastion host can be configured only for SSH connections.
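
For reference, a WinRM connection block could look roughly like this (the Administrator user and the password variable are assumptions for a Windows AMI, not resources defined earlier in this book):

    connection { 
      type = "winrm" 
      user = "Administrator" 
      # var.admin_password is an assumed variable 
      password = "${var.admin_password}" 
    } 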

There are three ways to provide a script: inline in the template, by specifying a path to a script, or by specifying a whole array of paths to different scripts, which will be executed in the order you provide them. Let's keep it simple and provide the script inline first:

  provisioner "remote-exec" { 
    connection { 
      user = "centos" 
    } 
    inline = [ 
      "sudo rpm -ivh http://yum.puppetlabs.com/puppetlabs-release-el-7.noarch.rpm", 
      "sudo yum install puppet -y" 
    ] 
  } 

This will install the Puppet yum repository and Puppet itself. It's a bit silly to install Puppet and not use it, though. We need to configure something with it.

Although scalable, secure Puppet environments often assume a Puppet Master is in place, it's easy to use Puppet in masterless mode as well. To do so, we can use the puppet apply command, which requires a manifest file (a configuration description written in the Puppet language). But there is no manifest file on the server! We need to put it there somehow. The file provisioner will help us with that.

Uploading files with a file provisioner

A file provisioner simply uploads a file to the server. It's a perfect way to upload configuration files, certificates, and so on. Create a new file named setup.pp in the ./modules/application/ folder with the following content:

host { 'repository': 
  ip => '10.24.45.127', 
} 

Puppet's host resource will add a host entry on the machine. Normally, we should not hardcode host entries on a machine. Sometimes, however, the machine doesn't have access to a DNS server yet, but it already needs to install some packages from an internal repository. That's the use case our manifest will cover. We just need to upload it.

Because we will end up with two different provisioners, file and remote-exec, we should move the connection block outside the remote-exec provisioner and define it at the resource level. The file provisioner is simple: we only need to specify the source file and the destination:

 resource "aws_instance" "app-server" { 
  ami = "${data.aws_ami.app-ami.id}" 
  instance_type = "${lookup(var.instance_type, var.environment)}" 
  subnet_id = "${var.subnet_id}" 
  vpc_security_group_ids = ["${concat(var.extra_sgs, aws_security_group.allow_http.*.id)}"] 
  user_data = "${data.template_file.user_data.rendered}" 
  key_name = "${var.keypair}" 
  connection { 
    user = "centos" 
  } 
  provisioner "file" { 
    source = "${path.module}/setup.pp" 
    destination = "/tmp/setup.pp" 
  } 
  provisioner "remote-exec" { 
    inline = [ 
      "sudo rpm -ivh http://yum.puppetlabs.com/puppetlabs-release-el-7.noarch.rpm", 
      "sudo yum install puppet -y", 
      "sudo puppet apply /tmp/setup.pp" 
    ] 
  } 
  tags { 
    Name = "${var.name}" 
  } 
} 

Destroy the template, or taint the aws_instance resource, and apply it again to rerun the provisioners. Terraform will output everything from the scripts to the console, so you should see something like the following:

module.mighty_trousers.aws_instance.app-server (remote-exec): Complete!
module.mighty_trousers.aws_instance.app-server (remote-exec): 
Notice: Compiled catalog for ip-10-0-1-127.eu-central-1.compute.internal 
in environment production in 0.07 seconds
module.mighty_trousers.aws_instance.app-server (remote-exec): Notice: /Stage[main]/Main/Host[repository]/ensure: created
module.mighty_trousers.aws_instance.app-server (remote-exec): 
Notice: Finished catalog run in 0.02 seconds

Even without built-in support for Puppet, it appears to be relatively simple to use it with Terraform. It is perhaps a bit more involved in a server-client setup (you need to handle node registration properly), but still nothing too complicated. That's the flexibility remote-exec brings.
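
For reference, in a server-client setup the inline commands could look roughly like this (the Puppet Master hostname is an assumption, and certificate signing still has to be handled on the master side):

    # puppet.internal below is a hypothetical Puppet Master hostname 
    inline = [ 
      "sudo rpm -ivh http://yum.puppetlabs.com/puppetlabs-release-el-7.noarch.rpm", 
      "sudo yum install puppet -y", 
      "sudo puppet agent --server puppet.internal --waitforcert 60 --test" 
    ] 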

One question that could be raised is: why would I use provisioners instead of cloud-init? This is a valid question, and there is exactly one big reason to use provisioners: dependency management inside Terraform. If you use cloud-init, then there is no way to order the creation of different resources inside the Terraform template, simply because Terraform has no idea when cloud-init has finished its job. And that's a problem if you have some kind of master that should exist before every slave node is provisioned, because each slave needs the master to exist. With provisioners, this is not a problem: a resource is not considered created until provisioning has finished. This means that if resource A depends on resource B, Terraform won't start creating resource A until all provisioners of resource B have finished.
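
As a rough illustration (the worker resource here is hypothetical), any resource that lists the app server in depends_on, or references one of its attributes, will only be created after all of its provisioners have completed:

resource "aws_instance" "worker" { 
  # hypothetical second instance, shown only to illustrate ordering 
  ami = "${data.aws_ami.app-ami.id}" 
  instance_type = "t2.micro" 
  subnet_id = "${var.subnet_id}" 
  # Terraform will not start creating this instance until every 
  # provisioner on aws_instance.app-server has finished 
  depends_on = ["aws_instance.app-server"] 
} 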

So far, all the provisioners you have learned about are meant to be used with a single resource, and none of them can be rerun without recreating the resource they provision. In this situation, null_resource is our friend.

Note

While this book is about Terraform, you should always know your options. Similar to Chef, Puppet has built-in ways to do the same job as Terraform: to describe the infrastructure in a single template. In the case of Puppet, it has its own powerful language that allows Puppet modules such as puppetlabs-aws (https://github.com/puppetlabs/puppetlabs-aws) to describe all of your cloud resources in an idempotent way. Again, if you are a heavy Puppet user, consider the features it has before bringing extra tools such as Terraform into your company.
