Chapter 4.  Storing and Supplying Configuration

So far, we know how to codify our infrastructure into Terraform templates of varying sizes. We know how to structure templates and how to split and reuse them with modules. More than this, we have already figured out important concepts behind how Terraform works. But there is an important piece we have barely looked at: configuration.

A template with only hardcoded data in it is a bad template. You can't reuse it in other projects without modifying it. You always have to update it by hand if some value changes. And you have to store a lot of information that doesn't really belong to the infrastructure template.

In this chapter, we will learn how to make Terraform templates more configurable. First, we will take a look at variables and all the possible ways to use them. Then, we will learn how to use data sources to retrieve information from outside the Terraform template. Finally, we will use the capabilities of built-in providers to generate random data, secrets, and config files. As a bonus, we will also take a very quick look at another HashiCorp tool: Consul.

Understanding variables

If you've ever used any programming language, then you might be familiar with variables already. In the most common case, they allow you to assign a value (a number, a string, or something else) to a hand-picked name and to reference this value by that name inside your code. If you need to modify the value, then you just need to do it once, in the place where the variable is defined.

Unlike in programming languages, variables in Terraform are more like input data for your templates: you define them before using the template. During the Terraform run, you have zero control over variables. The values of variables never change; you can't modify them inside the template.

In the previous chapter, we already used variables to configure modules. We also learned that our template.tf is a module: the root module. Let's define some variables for the root module.

It is a common pattern to split variables, template, and outputs into three different files. As you might remember, Terraform loads all files with the .tf extension from the current folder, so you don't need to do any extra steps to join these three files. Let's create a new file variables.tf with the following content:

variable "region" {}  

Now let's use it inside template.tf to configure AWS provider:

provider "aws" { 
  region = "${var.region}" 
} 

If we tried to apply or plan the template now, Terraform would interactively ask us for the value of this variable:

$> terraform plan 
var.region 
  Enter a value: 

That's nice and sometimes convenient, but in most cases, you don't want to type the values of all variables every time. Not only is it inconvenient, but it can also be dangerous: mistyping a variable value can lead to terrible consequences. If a variable is used in a resource parameter whose change forces recreation of the resource, then a typo will lead to accidental removal of that resource. Don't rely on the manual input of configuration data.

We could reduce the chances of accidental infrastructure destruction by adding a description to the variable:

variable "region" {  
  description = "AWS region. Changing it will lead to loss of complete stack." 
} 

Now the user of the template will see this description when he or she tries to apply it:

$> terraform plan
var.region
  AWS region. Changing it will lead to loss of complete stack.
  Enter a value: 

It doesn't save us from typos though. An even more reliable way to protect the infrastructure from human mistakes is to give the variable a default value:

variable "region" {  
  description = "AWS region. Changing it will lead to loss of complete stack." 
  default = "eu-central-1" 
} 

With the default value in place, Terraform won't ask for the value interactively anymore. It will pick the default value unless other sources of variables are present.

There are three types of variables you can set:

  • the string variables (default ones)
  • the map variables
  • the list variables

Only string variables can be set interactively; for maps and lists, you have to use other methods, which we will take a look at a bit later.
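As a rough illustration of the three types, the sketch below declares one variable of each kind with an explicit type. The variable names and values are hypothetical and are not part of the MightyTrousers template; only the type keywords come from Terraform itself.

```hcl
# Hypothetical variables, for illustration only.

# A string variable: "string" is the default type, so the type line is optional.
variable "project" {
  type    = "string"
  default = "mighty-trousers"
}

# A map variable: a lookup table of keys and values.
variable "owners" {
  type = "map"

  default = {
    dev  = "dev-team"
    prod = "ops-team"
  }
}

# A list variable: an ordered collection of values.
variable "zones" {
  type    = "list"
  default = ["eu-central-1a", "eu-central-1b"]
}
```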

Using map variables

If you've used maps, dictionaries, or hashes in a programming language such as Ruby, then you know what a map in Terraform is. A map is a lookup table, where you specify multiple keys with different values. You can then pick a value depending on the key. It's easier to understand with an example.

At the moment, our MightyTrousers application always uses the t2.micro instance type. These are cheap instances that are good for quick tests and development, but they are not that great for production. What we actually want is a way to use different instance types depending on the environment the stack is deployed to. Let's assume that we have only three environments: dev, prod, and test.

First, let's move the variables out of the modules/application/application.tf file into modules/application/variables.tf. Then, let's define two new variables there: environment and instance_type.

variable "environment" { default = "dev" } 
variable "instance_type" { 
   type = "map" 
   default = { 
     dev = "t2.micro" 
     test = "t2.medium" 
     prod = "t2.large" 
   } 
} 

We specified the type explicitly, even though it's not really required when there is a default value as well: Terraform infers the type from it. Without either, the type defaults to string.

You also need to add variable "environment" { default = "prod" } to the variables.tf file in the root folder of our project. We use prod at the top level to show that the value passed from the root module overrides the default of the module itself.

Then, modify the module to look as follows:

module "mighty_trousers" { 
  source = "./modules/application" 
  vpc_id = "${aws_vpc.my_vpc.id}" 
  subnet_id = "${aws_subnet.public.id}" 
  name = "MightyTrousers" 
  environment = "${var.environment}" 
} 

Here, we pass a variable from the root module to the application module. We don't need to pass the instance_type variable because we will just look up the value we need in the existing map. To do this, Terraform provides the lookup() interpolation function. This function accepts a map as the first argument, the key to look for in this map as the second argument, and an optional default value as the third argument. Let's modify the aws_instance resource in modules/application/application.tf to look as follows:

resource "aws_instance" "app-server" { 
  ami = "ami-9bf712f4" 
  instance_type = "${lookup(var.instance_type, var.environment)}" 
  subnet_id = "${var.subnet_id}" 
  vpc_security_group_ids = ["${aws_security_group.allow_http.id}"] 
  tags { 
    Name = "${var.name}" 
  } 
} 
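For comparison, here is a minimal sketch of lookup() with its optional third argument. It is a standalone scratch template rather than a part of the application module: the two variables mirror the ones defined above, while the output names, the staging value, and the t2.micro fallback are assumptions made for illustration.

```hcl
variable "environment" {
  default = "staging" # deliberately not a key of the map below
}

variable "instance_type" {
  type = "map"

  default = {
    dev  = "t2.micro"
    test = "t2.medium"
    prod = "t2.large"
  }
}

# There is no "staging" key, so lookup() returns the third argument.
output "with_fallback" {
  value = "${lookup(var.instance_type, var.environment, "t2.micro")}"
}

# A fixed key can also be read with the index syntax.
output "prod_type" {
  value = "${var.instance_type["prod"]}"
}
```

Without the third argument, a missing key makes Terraform fail with an error, which is often preferable to silently falling back to the wrong instance size.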

We did not specify a default value inside the lookup() function, because the environment variable already has a default at both the module and root levels. Let's run the terraform plan command to see which parameters the instance would get:

$> terraform plan
< .... >
+ module.mighty_trousers.aws_instance.app-server
    ami:                      "ami-9bf712f4"
    availability_zone:        "<computed>"
    ebs_block_device.#:       "<computed>"
    ephemeral_block_device.#: "<computed>"
    instance_state:           "<computed>"
    instance_type:            "t2.large"
    key_name:                 "<computed>"
    network_interface_id:     "<computed>"
    placement_group:          "<computed>"
    private_dns:              "<computed>"
    private_ip:               "<computed>"
    public_dns:               "<computed>"
    public_ip:                "<computed>"
    root_block_device.#:      "<computed>"
    security_groups.#:        "<computed>"
    source_dest_check:        "true"
    subnet_id:                "${var.subnet_id}"
    tags.%:                   "1"
    tags.Name:                "MightyTrousers"
    tenancy:                  "<computed>"
    vpc_security_group_ids.#: "<computed>"

Indeed, it took the t2.large instance type. Maps give you more flexibility compared with regular string variables. So do lists.

Using list variables

Continuing the analogy with programming, a list in Terraform is similar to an array in most programming languages. There is a very nice place where we can use lists in our templates: security groups.

Currently, the application module defines a single security group and assigns it to the instance. But an EC2 instance can have multiple security groups attached. We could have a default security group that allows SSH access, and then another one at the application level for app-specific permissions.

Let's add yet another variable to modules/application/variables.tf, with an empty list as a default value:

variable "extra_sgs" { default = [] } 

Now, let's define a default security group in template.tf with SSH access allowed:

resource "aws_security_group" "default" { 
  name = "Default SG" 
  description = "Allow SSH access" 
  vpc_id = "${aws_vpc.my_vpc.id}" 
 
  ingress { 
    from_port = 22 
    to_port = 22 
    protocol = "tcp" 
    cidr_blocks = ["0.0.0.0/0"] 
  } 
} 
 

Now we can pass it to the module, wrapping it with square brackets (which means it's a list):

module "mighty_trousers" { 
  source = "./modules/application" 
  vpc_id = "${aws_vpc.my_vpc.id}" 
  subnet_id = "${aws_subnet.public.id}" 
  name = "MightyTrousers" 
  environment = "${var.environment}" 
  extra_sgs = ["${aws_security_group.default.id}"] 
} 

Now we only need to use these extra security group IDs together with the app-specific security group. To achieve this, we will use the concat() interpolation function, which joins multiple lists into one. We had also better ensure that the resulting list doesn't contain duplicates. The distinct() function helps with this: it removes all duplicates, keeping only the first occurrence of each repeated element. We will join the extra_sgs list with a list made from the app-specific SG defined in application.tf:

resource "aws_instance" "app-server" { 
  ami = "ami-9bf712f4" 
  instance_type = "${lookup(var.instance_type, var.environment)}" 
  subnet_id = "${var.subnet_id}" 
  vpc_security_group_ids = ["${distinct(concat(var.extra_sgs, aws_security_group.allow_http.*.id))}"] 
  tags { 
    Name = "${var.name}" 
  } 
} 

The syntax might not look obvious here, especially if you are coming from a programming background. It takes some time to get used to the peculiarities of the Terraform DSL. One would expect that, because we have a single app-specific security group, we could simply wrap it in square brackets as follows:

["${concat(var.extra_sgs, [aws_security_group.allow_http.id])}"] 

Unfortunately, it doesn't work like this. Internally, Terraform treats aws_security_group.allow_http not as a single resource, but as a list consisting of a single resource. Terraform doesn't have loops. Instead, it has a special syntax for iterating over multiple resources with the * symbol. In the background, we have the following:

aws_security_group.allow_http.*.id  

Terraform transforms the preceding to something similar to the following:

[aws_security_group.allow_http.0.id] 

If we had multiple groups, the result would be the following:

[aws_security_group.allow_http.0.id .. aws_security_group.allow_http.N.id] 

Here, N is the number of groups. We didn't discuss how to create multiple instances of the same resource yet, though. That's the topic for another chapter.
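To make the splat syntax a little more tangible, the following sketch adds two outputs next to the security group in modules/application/application.tf. The output names are hypothetical; element() is a built-in interpolation function that picks a list item by index.

```hcl
# Assumes aws_security_group.allow_http is defined in the same module.

# The splat expression yields a list of IDs (a single-element list here).
output "app_sg_ids" {
  value = ["${aws_security_group.allow_http.*.id}"]
}

# element() picks one item from that list by its zero-based index.
output "first_app_sg_id" {
  value = "${element(aws_security_group.allow_http.*.id, 0)}"
}
```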

Note

The Terraform language can be very confusing at times. Since version 0.8.0, there is a terraform console command, which allows you to play around with different interpolation functions and other features in an interactive console. The console itself is quite unpredictable as well, but you should expect it to become more useful over time.

Both maps and lists allow you to build complex, though sometimes not obvious, constructions around Terraform variables with the help of various interpolation functions. But so far, we have defined our variables only via default values. It's time to figure out how to do it differently. First, let's learn how to provide variable values inline when invoking Terraform commands.

Supplying variables inline

The easiest way (after interactive mode) to set variable values is to specify them as arguments to a Terraform command. This is done with one or more -var arguments, each followed by the name and value of a variable:

$> terraform plan -var 'environment=dev'

Note how the instance type of the EC2 server is different now, because we set the variable environment to dev.

So far, we don't have any map or list variables in the root module. Let's add a list of CIDR blocks that are allowed to access the default security group via SSH. Also, let's add a map with CIDR blocks for our subnets. It will have two blocks: one for the public and one for the private subnet. In the end, variables.tf should look as follows:

variable "region" { 
  description = "AWS region. Changing it will lead to loss of complete stack." 
  default = "eu-central-1" 
} 
variable "environment" { default = "prod" } 
variable "allow_ssh_access" { 
  description = "List of CIDR blocks that can access instances via SSH" 
  default = ["0.0.0.0/0"] 
} 
variable "vpc_cidr" { default = "10.0.0.0/16" }  
variable "subnet_cidrs" { 
  description = "CIDR blocks for public and private subnets" 
  default = { 
    public = "10.0.1.0/24" 
    private = "10.0.2.0/24" 
  } 
} 

As an exercise, make use of these new variables yourself with the help of the  lookup() function.
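One possible way to approach this exercise is sketched below. It assumes that the VPC and the public subnet are named aws_vpc.my_vpc and aws_subnet.public, as in the earlier snippets, and it omits any other arguments these resources may already have in your template.

```hcl
resource "aws_vpc" "my_vpc" {
  cidr_block = "${var.vpc_cidr}"
}

resource "aws_subnet" "public" {
  vpc_id = "${aws_vpc.my_vpc.id}"

  # Pick the CIDR block for the public subnet from the map.
  cidr_block = "${lookup(var.subnet_cidrs, "public")}"
}

resource "aws_security_group" "default" {
  name        = "Default SG"
  description = "Allow SSH access"
  vpc_id      = "${aws_vpc.my_vpc.id}"

  ingress {
    from_port = 22
    to_port   = 22
    protocol  = "tcp"

    # The list variable replaces the hardcoded ["0.0.0.0/0"].
    cidr_blocks = ["${var.allow_ssh_access}"]
  }
}
```

A private subnet would read its block the same way, with "private" as the key.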

If we tried to supply the allow_ssh_access variable via the command line, it would look like this:

$> terraform plan -var 'allow_ssh_access=["52.123.123.123/32"]'

If we needed to change the map of CIDR blocks, we could do it as follows:

$> terraform plan -var 'subnet_cidrs={public = "172.0.16.0/24", private = "172.0.17.0/24"}'

Setting variables via CLI arguments can be useful sometimes: to provide a password for some service or to tweak some values for development purposes. But it is in no way a reliable and production-ready storage. There is a better option: environment variables.

Using Terraform environment variables

The third way (after interactive input and inline arguments) to supply values to your variables is to use environment variables.

Note

Environment variables are part of the environment in which a process runs, and the program can access them. There are always some environment variables already set; for example, $PATH defines the paths where your shell will look for executables. You can get a list of currently set environment variables with the env command on *nix operating systems.

Terraform will automatically read all environment variables with the TF_VAR_ prefix. For example, to set a value for the region variable, you would need to set the TF_VAR_region environment variable.

There are multiple ways to set environment variables. You could do it inline with your terraform command execution, as follows:

$> TF_VAR_region=eu-central-1 terraform plan

But that's not much different from setting variables with the -var argument. Alternatively, you could set them once in your terminal:

$> export TF_VAR_subnet_cidrs='{public = "172.0.16.0/24", private = "172.0.17.0/24"}'
$> terraform plan

This would set the variable value for the current terminal session. You could unset it with the unset command:

$> unset TF_VAR_subnet_cidrs

It might be tempting to set all variables with export, but it can cause some problems: you don't have an easy overview of which variable has which value.

Note

A quick way to check which Terraform variables are defined via environment variables on *nix operating systems is to run env | grep "TF_VAR".

Of course, you could store your environment variables inside a text file and source it. Create a file vars with the following contents:

export TF_VAR_subnet_cidrs='{public = "172.0.16.0/24", private = "172.0.17.0/24"}' 
export TF_VAR_region=eu-central-1 

Then, source this file to make it available in your environment:

$> source vars
$> terraform plan

This approach still doesn't prevent you from reassigning an environment variable by accident. Once again, environment variables are good for development, but they are not the best solution yet. An even better way is to use variable files.

Using variable files

When running Terraform commands, you can optionally supply a variable file via the -var-file argument. The syntax of these files is the good old HCL, familiar to you from Terraform templates themselves. Create a new file named development.tfvars and set your variables there:

region = "eu-central-1" 
vpc_cidr = "172.0.0.0/16" 
subnet_cidrs = { 
  public = "172.0.16.0/24" 
  private = "172.0.17.0/24" 
} 

To use it, run the terraform plan command with the -var-file argument:

$> terraform plan -var-file=./development.tfvars 

It's much more reliable to use variable files for production stacks: you always know which values are there and you can store them in version control. And for sensitive things, such as a personal password to access the cloud account, you would still use environment variables or inline arguments.

Note

In Chapter 7, Collaborative Infrastructure, we will discuss better ways to deal with sensitive data.

To go even one step further, you could remove all defaults from variables.tf and set values only in the variable file, to completely eliminate configuration from your template.
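A sketch of what such a defaults-free variables.tf might look like follows. One point worth calling out: without a default to infer the type from, map and list variables need an explicit type, otherwise Terraform treats them as strings.

```hcl
variable "region" {
  description = "AWS region. Changing it will lead to loss of complete stack."
}

variable "environment" {}

variable "vpc_cidr" {}

variable "allow_ssh_access" {
  description = "List of CIDR blocks that can access instances via SSH"
  type        = "list"
}

variable "subnet_cidrs" {
  description = "CIDR blocks for public and private subnets"
  type        = "map"
}
```

With this layout, every value has to come from the variable file or another source, so development.tfvars would also need entries for environment and allow_ssh_access.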

Note

You could supply multiple variable files, with the ones defined later taking precedence over the ones defined earlier.
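As an illustration, a second file could adjust just a few values on top of development.tfvars. The file name overrides.tfvars and the values in it are hypothetical and only serve to show the precedence.

```hcl
# overrides.tfvars (hypothetical file name)
# Passed after development.tfvars, for example:
#   terraform plan -var-file=./development.tfvars -var-file=./overrides.tfvars

# Also set in development.tfvars; this later value takes precedence.
region = "eu-west-1"

# Not set in development.tfvars, so this only replaces the default.
allow_ssh_access = ["52.123.123.123/32"]
```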

Variables are first-class configuration in Terraform. With a set of simple but flexible ways to use and set them, you have full control over how to create your environment. For production use, variable files are a must. You can write them yourself or use a script that fetches remote data and generates a variable file.

Variables are not the only source of configuration though. Since Terraform 0.7, there are data sources as well. Let's take a look at why and how we can use them.
