So far, we know how to codify our infrastructure into Terraform templates of varying sizes. We know how to structure templates and how to split and reuse them with modules. More than this, we have already figured out the important concepts behind how Terraform works. But there is an important piece we have barely looked at: configuration.
A template with only hardcoded data in it is a bad template. You can't reuse it in other projects without modifying it. You always have to update it by hand if some value changes. And you have to store a lot of information that doesn't really belong to the infrastructure template.
In this chapter, we will learn how to make Terraform templates more configurable. First, we will take a look at variables and all the possible ways to use them. Then, we will learn how to use data resources to retrieve information from outside the Terraform template. Finally, we will use the built-in capabilities of providers to generate random data, secrets, and config files. As a bonus, we will also take a very quick look at another HashiCorp tool: Consul.
If you've ever used any programming language, then you might be familiar with variables already. In the most common case, they allow you to assign a value (a number, a string, or something else) to some hand-picked name and reference this value by this name inside your code. If you need to modify the value, you only need to do it once, in the place where the variable is defined.
Unlike in programming languages, variables in Terraform are more like input data for your templates: you define them before using the template. During the Terraform run, you have zero control over variables. The values of variables never change; you can't modify them inside the template.
In the previous chapter, we already tried variables in order to configure modules. We also learned that our template.tf is a module: the root module. Let's define some variables for the root module.
It is a common pattern to split variables, template, and outputs into three different files. As you might remember, Terraform loads all files with the .tf
extension from the current folder, so you don't need to do any extra steps to join these three files. Let's create a new file variables.tf
with the following content:
variable "region" {}
Now let's use it inside template.tf to configure the AWS provider:
provider "aws" { region = "${var.region}" }
If we tried to apply or plan the template now, Terraform would interactively ask us for the value of this variable:
$> terraform plan
var.region
Enter a value:
That's nice and sometimes convenient, but in most cases, you don't want to type the values of all variables every time. Not only is it inconvenient, but it can also be dangerous: mistyping a variable value can lead to terrible consequences. If one of the variables is used inside a resource parameter that forces its recreation, then mistyping it will lead to accidental removal of the resource. Don't rely on the manual input of configuration data.
We could reduce the chances of accidental infrastructure destruction by adding a description to the variable:
variable "region" { description = "AWS region. Changing it will lead to loss of complete stack." }
Now the user of the template will see this description when he or she tries to apply it:
$> terraform plan
var.region
  AWS region. Changing it will lead to loss of complete stack.
  Enter a value:
It doesn't save us from typos though. An even more reliable way to protect the infrastructure from human mistakes is to give the variable a default value:
variable "region" {
  description = "AWS region. Changing it will lead to loss of complete stack."
  default     = "eu-central-1"
}
With the default value in place, Terraform won't ask for the value interactively anymore. It will pick the default value unless other sources of variables are present.
There are three types of variables you can set: string variables (the default type), map variables, and list variables.
You can set only the string variables interactively; for map and list, you have to use other methods, which we will take a look at a bit later.
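To make the three types concrete, here is a minimal sketch with one declaration of each kind, written in the same 0.8-era syntax used throughout this chapter. The variable names (project_name, instance_sizes, availability_zones) and their values are hypothetical and are not part of the MightyTrousers templates.

```hcl
# String variable: the default type, so "type" could be omitted here.
variable "project_name" {
  type    = "string"
  default = "mighty-trousers"
}

# Map variable: a lookup table of keys and values.
variable "instance_sizes" {
  type = "map"

  default = {
    small = "t2.micro"
    large = "t2.large"
  }
}

# List variable: an ordered collection of values.
variable "availability_zones" {
  type    = "list"
  default = ["eu-central-1a", "eu-central-1b"]
}
```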
If you've used maps, dictionaries, or hashes in some programming language such as Ruby, then you know what a map in Terraform is. A map is a lookup table, where you specify multiple keys with different values. You can then pick the value depending on the key. It's easier to understand with an example.
At the moment, our MightyTrousers application always uses the t2.micro instance type. These are cheap instances that are good for quick tests and development, but they are not that great for production. What we actually want is a way to use different instance types depending on the environment the stack is deployed to. Let's assume that we have only three environments: dev, prod, and test.
First, let's move variables out of the modules/application/application.tf
file to modules/application/variables.tf
. And then let's define two new variables there: environment
and instance_type
.
variable "environment" {
  default = "dev"
}

variable "instance_type" {
  type = "map"

  default = {
    dev  = "t2.micro"
    test = "t2.medium"
    prod = "t2.large"
  }
}
We specified type
explicitly, even though it's not really required when we have the default
value as well. The default type is string
.
You also need to add variable "environment" { default = "prod" } to the variables.tf file in the root folder of our project. We use prod at the top level to show that a value passed in from the root module overrides the default defined in the module itself.
Then, modify the module to look as follows:
module "mighty_trousers" {
  source      = "./modules/application"
  vpc_id      = "${aws_vpc.my_vpc.id}"
  subnet_id   = "${aws_subnet.public.id}"
  name        = "MightyTrousers"
  environment = "${var.environment}"
}
Here, we pass a variable from the root module to the application module. We don't need to pass the instance_type variable because we will just look up the value we need in the existing variable. To do this, Terraform provides the lookup() interpolation function. This function accepts a map as the first argument, the key to look for in this map as the second argument, and an optional default value as the third argument. Let's modify the aws_instance resource in modules/application/application.tf to look as follows:
resource "aws_instance" "app-server" {
  ami                    = "ami-9bf712f4"
  instance_type          = "${lookup(var.instance_type, var.environment)}"
  subnet_id              = "${var.subnet_id}"
  vpc_security_group_ids = ["${aws_security_group.allow_http.id}"]

  tags {
    Name = "${var.name}"
  }
}
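For comparison, here is a minimal sketch of the three-argument form of lookup(). It is not the version used in this chapter: it falls back to t2.micro when var.environment holds a key that is missing from the map, whereas the two-argument form is expected to fail with an error in that case. The fallback value is an assumption chosen purely for illustration.

```hcl
resource "aws_instance" "app-server" {
  ami       = "ami-9bf712f4"
  subnet_id = "${var.subnet_id}"

  # Third argument: value returned if var.environment is not a key
  # in var.instance_type (hypothetical fallback).
  instance_type = "${lookup(var.instance_type, var.environment, "t2.micro")}"
}
```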
We did not specify the default value inside the lookup()
function; there is already a default on both module and root levels. Let's run the terraform plan
command to see which parameters the instance would get:
$> terraform plan
< .... >
+ module.mighty_trousers.aws_instance.app-server
    ami:                      "ami-378f925b"
    availability_zone:        "<computed>"
    ebs_block_device.#:       "<computed>"
    ephemeral_block_device.#: "<computed>"
    instance_state:           "<computed>"
    instance_type:            "t2.large"
    key_name:                 "<computed>"
    network_interface_id:     "<computed>"
    placement_group:          "<computed>"
    private_dns:              "<computed>"
    private_ip:               "<computed>"
    public_dns:               "<computed>"
    public_ip:                "<computed>"
    root_block_device.#:      "<computed>"
    security_groups.#:        "<computed>"
    source_dest_check:        "true"
    subnet_id:                "${var.subnet_id}"
    tags.%:                   "1"
    tags.Name:                "MightyTrousers"
    tenancy:                  "<computed>"
    vpc_security_group_ids.#: "<computed>"
Indeed, it took the t2.large instance type. Maps give you more flexibility compared with regular string variables. So do lists.
Continuing the analogy with programming, a list in Terraform is similar to an array in most programming languages. There is a very nice place where we can use lists in our templates: security groups.
Currently, the application module defines a single security group and assigns it to the instance. But an EC2 instance can have multiple security groups attached. We could have a default security group that allows SSH access and then, at the application level, another one for app-specific permissions.
Let's add yet another variable to modules/application/variables.tf, with an empty list as a default value:
variable "extra_sgs" { default = [] }
Now, let's define a default security group in template.tf with SSH access allowed:
resource "aws_security_group" "default" {
  name        = "Default SG"
  description = "Allow SSH access"
  vpc_id      = "${aws_vpc.my_vpc.id}"

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
Now we can pass it to the module, wrapping it with square brackets (which means it's a list):
module "mighty_trousers" {
  source      = "./modules/application"
  vpc_id      = "${aws_vpc.my_vpc.id}"
  subnet_id   = "${aws_subnet.public.id}"
  name        = "MightyTrousers"
  environment = "${var.environment}"
  extra_sgs   = ["${aws_security_group.default.id}"]
}
Now we only need to use these extra security group IDs together with the app-specific security group. To achieve this, we will use the concat() interpolation function. This function joins multiple lists into one. We had also better ensure that the resulting list doesn't have duplicates. The distinct() function will help with this; it removes all the duplicates, keeping only the first occurrence of each repeated element. We will join the extra_sgs list with a list made from the app-specific SG defined in application.tf:
resource "aws_instance" "app-server" {
  ami                    = "ami-9bf712f4"
  instance_type          = "${lookup(var.instance_type, var.environment)}"
  subnet_id              = "${var.subnet_id}"
  vpc_security_group_ids = ["${distinct(concat(var.extra_sgs, aws_security_group.allow_http.*.id))}"]

  tags {
    Name = "${var.name}"
  }
}
The syntax might not look obvious here, especially if you are coming from a programming background. It takes some time to get used to the peculiarities of the Terraform DSL. One would expect that, because we have a single app-specific security group, we could simply wrap it with square brackets as follows:
["${concat(var.extra_sgs, [aws_security_group.allow_http.id])}"]
Unfortunately, it doesn't work like this. Internally, aws_security_group.allow_http is treated not as a single resource, but as a list consisting of a single resource. Terraform doesn't have loops. Instead, it has a special syntax to iterate over multiple resources with the * symbol. In the background, we have the following:
aws_security_group.allow_http.*.id
Terraform transforms the preceding to something similar to the following:
[aws_security_group.allow_http.0.id]
If we had multiple groups, the result would be the following:
[aws_security_group.allow_http.0.id .. aws_security_group.allow_http.N.id]
Here, N is the index of the last group, that is, the number of groups minus one. We haven't discussed how to create multiple instances of the same resource yet, though. That's the topic for another chapter.
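To see the splat syntax in isolation, the following sketch shows two outputs that could be added to the application module. The output names are hypothetical, and the example assumes the same 0.8-era interpolation syntax used in the rest of this chapter.

```hcl
# All IDs as a list; with a single security group the list has one element.
output "app_sg_ids" {
  value = ["${aws_security_group.allow_http.*.id}"]
}

# Only the first ID, picked out of the splat list with element().
output "first_app_sg_id" {
  value = "${element(aws_security_group.allow_http.*.id, 0)}"
}
```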
The Terraform language can be very confusing at times. Since version 0.8.0, there is a terraform console command, which allows you to play around with different interpolation functions and other features in an interactive console. The console itself is quite unpredictable as well, but you should expect it to become more useful over time.
Both map and list allow you to build complex, though sometimes not obvious, constructions around Terraform variables by using various interpolation functions. But so far, we have defined our variables only via default values. It's time to figure out how to do it differently. First, let's learn how to provide variable values inline when invoking Terraform commands.
The easiest way (after interactive mode) to set variable values is to specify them as arguments to the Terraform command. This is done with one or more -var arguments, each followed by the name and value of a variable:
$> terraform plan -var 'environment=dev'
Note how the instance type of the EC2 server is different because we set the variable environment to dev.
So far, we don't have any map or list variables for the root module. Let's add a list of CIDR blocks that are allowed to access the default security group via SSH. Also, let's add a map with CIDR blocks for our subnets. We will have two blocks: one for the public subnet and one for the private subnet. In the end, variables.tf should look as follows:
variable "region" {
  description = "AWS region. Changing it will lead to loss of complete stack."
  default     = "eu-central-1"
}

variable "environment" {
  default = "prod"
}

variable "allow_ssh_access" {
  description = "List of CIDR blocks that can access instances via SSH"
  default     = ["0.0.0.0/0"]
}

variable "vpc_cidr" {
  default = "10.0.0.0/16"
}

variable "subnet_cidrs" {
  description = "CIDR blocks for public and private subnets"

  default = {
    public  = "10.0.1.0/24"
    private = "10.0.2.0/24"
  }
}
As an exercise, make use of these new variables yourself with the help of the lookup()
function.
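One possible way to approach this exercise is sketched below. The resource names come from earlier in the chapter, but the exact arguments of aws_vpc.my_vpc and aws_subnet.public in your own template.tf may differ, so treat everything other than the variable references as an assumption.

```hcl
resource "aws_vpc" "my_vpc" {
  cidr_block = "${var.vpc_cidr}"
}

resource "aws_subnet" "public" {
  vpc_id = "${aws_vpc.my_vpc.id}"

  # Pick the CIDR block for the public subnet out of the map.
  cidr_block = "${lookup(var.subnet_cidrs, "public")}"
}

resource "aws_security_group" "default" {
  name        = "Default SG"
  description = "Allow SSH access"
  vpc_id      = "${aws_vpc.my_vpc.id}"

  ingress {
    from_port = 22
    to_port   = 22
    protocol  = "tcp"

    # The variable is already a list; the brackets follow the
    # list-wrapping style used elsewhere in this chapter.
    cidr_blocks = ["${var.allow_ssh_access}"]
  }
}
```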
If we tried to supply the allow_ssh_access variable via the command line, it could look like this:
$> terraform plan -var 'allow_ssh_access=["52.123.123.123/32"]'
If we needed to change the map of CIDR blocks, we could do it as follows:
$> terraform plan -var 'subnet_cidrs={public = "172.0.16.0/24", private = "172.0.17.0/24"}'
Setting variables via CLI arguments can be useful sometimes: to provide a password for some service or to tweak some values for development purposes. But it is in no way a reliable, production-ready way to store configuration. There is a better option: environment variables.
The third way (after interactive input and inline arguments) to supply values to your variables is to use environment variables.
Environment variables are part of the environment in which a process is running, and the program can access them. There are always some environment variables already set; for example, $PATH defines the paths where your shell will look for executables. You can get a list of currently set environment variables with the env command on *nix operating systems.
Terraform will automatically read all environment variables with the TF_VAR_ prefix. For example, to set a value for the region variable, you would need to set the TF_VAR_region environment variable.
There are multiple ways to set environment variables. You could do it inline with your terraform
command execution, as follows:
$> TF_VAR_region=eu-central-1 terraform plan
But that's not much different from setting variables with the -var
argument. Alternatively, you could set them once in your terminal:
$> export TF_VAR_subnet_cidrs='{public = "172.0.16.0/24", private = "172.0.17.0/24"}'
$> terraform plan
This would set the variable value for the current terminal session. You could unset it with the unset
command:
$> unset TF_VAR_subnet_cidrs
It might be tempting to set all variables with export
, but it can cause some problems: you don't have an easy overview of which variable has which value.
Of course, you could store your environment variables inside a text file and source it. Create a file vars
with the following contents:
export TF_VAR_subnet_cidrs='{public = "172.0.16.0/24", private = "172.0.17.0/24"}'
export TF_VAR_region=eu-central-1
Then, source
this file to make it available in your environment:
$> source vars
$> terraform plan
This approach still doesn't prevent you from reassigning an environment variable by accident. Once again, environment variables are good for development, but they are not the best solution yet. An even better way is to use variable files.
When running Terraform commands, you can optionally supply a variable file via the -var-file
argument. The syntax of these files is the good old HCL, familiar to you from Terraform templates themselves. Create a new file named development.tfvars
and set your variables there:
region   = "eu-central-1"
vpc_cidr = "172.0.0.0/16"

subnet_cidrs = {
  public  = "172.0.16.0/24"
  private = "172.0.17.0/24"
}
To use it, run the terraform plan command with the -var-file argument:
$> terraform plan -var-file=./development.tfvars
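List variables can be set in a variable file in the same way as strings and maps. As a minimal sketch, a second file for another environment could look like this; the file name production.tfvars and the values in it are assumptions chosen for illustration.

```hcl
# production.tfvars (hypothetical file name)
environment = "prod"
region      = "eu-central-1"

# List variable
allow_ssh_access = ["52.123.123.123/32"]

# Map variable
subnet_cidrs = {
  public  = "10.0.1.0/24"
  private = "10.0.2.0/24"
}
```

It would be passed to Terraform in the same way, with -var-file=./production.tfvars.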
It's much more reliable to use variable files for production stacks: you always know which values are there and you can store them in version control. And for sensitive things, such as a personal password to access the cloud account, you would still use environment variables or inline arguments.
In Chapter 7, Collaborative Infrastructure, we will discuss better ways to deal with sensitive data.
To go one step further, you could remove all defaults from variables.tf and set them only in the variable file, to completely eliminate configuration from your template.
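A minimal sketch of what variables.tf could look like with all defaults removed is shown below. One point is worth checking against your Terraform version: since the default type is string and there is no longer a default value to infer the type from, this sketch assumes that the list and map variables need an explicit type.

```hcl
variable "region" {
  description = "AWS region. Changing it will lead to loss of complete stack."
}

variable "environment" {}

variable "allow_ssh_access" {
  type        = "list"
  description = "List of CIDR blocks that can access instances via SSH"
}

variable "vpc_cidr" {}

variable "subnet_cidrs" {
  type        = "map"
  description = "CIDR blocks for public and private subnets"
}
```

With this layout, every value has to come from a variable file or from one of the other sources described earlier.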
Variables are first-class configuration in Terraform. With a set of simple but flexible ways to use and set them, you have full control over how your environment is created. For production use, variable files are a must. You can write them yourself or use a script that fetches remote data and generates a variable file.
Variables are not the only source of configuration though. Since Terraform 0.7, there are data sources as well. Let's take a look at why and how we can use them.