Chapter 10. Infrastructure as Code

Before we had fancy DevOps titles and job descriptions, we were lowly system administrators, or sysadmins for short. Those were the dark, pre-cloud days when we had to load the trunks of our cars with bare-metal servers and drive to a colocation (colo) facility to rack the servers, wire them, attach a wheeled monitor/keyboard/mouse to them, and set them up one by one. Grig still shudders to think about the hours he spent in colos, in blinding light and freezing A/C. We had to be wizards at Bash scripting, then we graduated to Perl, and the more fortunate of us to Python. As the saying went, the internet circa 2004 was held together with duct tape and bubble gum.

Somewhere between 2006 and 2007, we discovered the magical world of Amazon EC2 instances. We were able to provision servers through a simple point-and-click interface, or through command-line tools. No more driving to colocation facilities, no more stacking and wiring bare-metal servers. We could go wild and launch 10 EC2 instances at a time. Or even 20! Or even 100! The sky was the limit. However, we quickly figured out that manually connecting to each EC2 instance using SSH and then setting up our applications on every instance separately was not going to scale. It was fairly easy to provision the instances themselves. What was difficult was to install the required packages for our applications, add the correct users, make sure the file permissions looked right, and finally install and configure our applications. To scratch this itch, the first generation of infrastructure automation software came into being, represented by “configuration management” tools. Puppet, the first well-known configuration management tool, was released in 2005 and predated the release of Amazon EC2. Other such tools that were launched on the heels of Puppet were Chef in 2008, followed by SaltStack in 2011, and Ansible in 2012.

By 2009, the world was ready to welcome the arrival of a new term: DevOps. To this day, there are competing definitions of DevOps. What is interesting is that it came into being in the tumultuous early days of infrastructure software automation. While there are important people and culture aspects to DevOps, one thing stands out in this chapter: the ability to automate the provisioning, configuration, and deployment of infrastructure and applications.

By 2011, it was getting hard to keep track of all the services comprising the Amazon Web Services (AWS) suite. The cloud was much more complicated than raw compute power (Amazon EC2) and object storage (Amazon S3). Applications started to rely on multiple services interacting with each other, and tools were needed to help automate the provisioning of these services. Amazon didn’t wait long to fill this need, and in 2011 it started offering just such a tool: AWS CloudFormation. This was one of the first moments when we could truly say that we were able to describe our infrastructure through code. CloudFormation opened the doors to a new generation of Infrastructure as Code (IaC) tools, which were operating at the layer of the cloud infrastructure itself, underneath the layer served by the first-generation configuration management tools.

By 2014, AWS had launched dozens of services. That was the year when another important tool in the world of IaC came into being: Terraform, by HashiCorp. To this day, the two most used IaC tools are CloudFormation and Terraform.

Another important development in the world of IaC and DevOps took place sometime between late 2013 and early 2014: the release of Docker, which came to be synonymous with container technologies. Although containers had been around for a number of years, the great benefit that Docker brought to the table was that it wrapped technologies such as Linux containers and cgroups into an easy-to-use API and command-line interface (CLI) toolset. This significantly lowered the barrier to entry for people who wanted to package their applications into containers that could be deployed and run wherever Docker was running. Container technologies and container orchestration platforms are discussed in detail in Chapters 11 and 12.

The usage and mindshare of Docker exploded and damaged the popularity of the first-generation configuration management tools (Puppet, Chef, Ansible, SaltStack). The companies behind these tools are reeling at the moment and are all trying to stay afloat and current by reinventing themselves as cloud friendly. Before the advent of Docker, you would provision the infrastructure for your application with an IaC tool such as CloudFormation or Terraform, then deploy the application itself (code and configuration) with a configuration management tool such as Puppet, Chef, Ansible, or SaltStack. Docker suddenly made these configuration management tools obsolete, since it provided a means for you to package your application (code + configuration) in a Docker container that would then run inside the infrastructure provisioned by the IaC tools.

A Classification of Infrastructure Automation Tools

Fast-forward to 2020 and it is easy to feel lost as a DevOps practitioner when faced with the multitude of infrastructure automation tools available.

One way to differentiate IaC tools is by looking at the layer at which they operate. Tools such as CloudFormation and Terraform operate at the cloud infrastructure layer. They allow you to provision cloud resources such as compute, storage, and networking, as well as various services such as databases, message queues, data analytics, and many others. Configuration management tools such as Puppet, Chef, Ansible, and SaltStack typically operate at the application layer, making sure that all the required packages are installed for your application, and that the application itself is configured correctly (although many of these tools also have modules that can provision cloud resources). Docker also operates at the application layer.

Another way to compare IaC tools is by dividing them into declarative versus imperative categories. You can tell an automation tool what to do in a declarative manner where you describe the state of the system that you are trying to achieve. Puppet, CloudFormation, and Terraform operate in a declarative manner. Alternatively, you can use an automation tool in a procedural or imperative manner, where you specify the exact steps needed by the tool to achieve the desired system state. Chef and Ansible operate in an imperative manner. SaltStack can operate in both declarative and imperative manners.

Let’s look at the desired state of the system as a blueprint for the construction of a building, let’s say a stadium. You use procedural tools like Chef and Ansible to build the stadium, section by section and row by row inside each section. You need to keep track of the state of the stadium and the progress of the construction. Using declarative tools such as Puppet, CloudFormation, and Terraform, you first put together the blueprint for the stadium. The tool then makes sure that the construction achieves the state depicted in the blueprint.
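
The contrast can be sketched in a few lines of plain Python. This is a toy model under stated assumptions, not how any of the tools above are implemented: the "infrastructure" is just a dictionary, and the names imperative_build, plan, and apply are hypothetical helpers invented for this illustration.

```python
# Toy model: "infrastructure" is a dict mapping resource name -> settings.

def imperative_build(infra):
    """Procedural style: spell out each step and check progress ourselves."""
    if "bucket" not in infra:
        infra["bucket"] = {"public": False}
    infra["bucket"]["public"] = True
    if "cdn" not in infra:
        infra["cdn"] = {"origin": "bucket"}
    return infra


def plan(current, desired):
    """Declarative style, step 1: diff the blueprint against reality."""
    actions = []
    for name, settings in desired.items():
        if name not in current:
            actions.append(("create", name))
        elif current[name] != settings:
            actions.append(("update", name))
    for name in current:
        if name not in desired:
            actions.append(("delete", name))
    return actions


def apply(current, desired):
    """Declarative style, step 2: make reality match the blueprint."""
    for action, name in plan(current, desired):
        if action == "delete":
            del current[name]
        else:
            current[name] = dict(desired[name])
    return current


# The blueprint describes the end state; the tool works out the steps.
blueprint = {"bucket": {"public": True}, "cdn": {"origin": "bucket"}}
existing = {"bucket": {"public": False}, "old_server": {"size": "small"}}

print(plan(existing, blueprint))
```

With the sample data, the plan lists an update for bucket, a create for cdn, and a delete for old_server. Note that the declarative functions never mention a specific resource by name; only the blueprint does.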

Given this chapter’s title, we will focus the remaining discussion on IaC tools, which can be further classified along several dimensions.

One dimension is the way you specify the desired state of the system. In CloudFormation, you do it with JSON or YAML syntax, while in Terraform you do it with HashiCorp Configuration Language (HCL) syntax, which is specific to HashiCorp tools. In contrast, Pulumi and the AWS Cloud Development Kit (CDK) allow you to use real programming languages, including Python, for specifying the desired state of the system.

Another dimension is the cloud providers supported by each tool. Since CloudFormation is an Amazon service, it stands to reason that it focuses on AWS (although one can define non-AWS resources with CloudFormation when using the custom resources feature). The same is true for the AWS CDK. In contrast, Terraform supports many cloud providers, as does Pulumi.

Because this is a book about Python, we would like to mention a tool called troposphere, which allows you to specify CloudFormation stack templates using Python code, and then exports them to JSON or YAML. Troposphere stops at the generation of the stack templates, which means that you need to provision the stacks using CloudFormation. One other tool that also uses Python and is worth mentioning is stacker. It uses troposphere under the covers, but it also provisions the generated CloudFormation stack templates.
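
To make the idea of generating stack templates from Python concrete, here is a minimal sketch that uses only the standard library. It does not use troposphere's API; the function website_bucket_template and the logical name SiteBucket are assumptions made for this illustration. The resource type and property names follow the CloudFormation template format.

```python
import json


def website_bucket_template(bucket_logical_name, index_document="index.html"):
    """Return a CloudFormation template (as a dict) for one S3 website bucket."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            bucket_logical_name: {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    "WebsiteConfiguration": {"IndexDocument": index_document}
                },
            }
        },
        "Outputs": {
            "WebsiteURL": {
                "Value": {"Fn::GetAtt": [bucket_logical_name, "WebsiteURL"]}
            }
        },
    }


# Build the template in Python, then export it as JSON for CloudFormation.
template = website_bucket_template("SiteBucket")
template_json = json.dumps(template, indent=2)
print(template_json)
```

As with troposphere, this stops at generating the template; the resulting JSON would still need to be handed to CloudFormation to provision the stack.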

The rest of this chapter shows two of these automation tools, Terraform and Pulumi, in action. Each works on a common scenario: the deployment of a static website in Amazon S3, fronted by the Amazon CloudFront CDN and secured by an SSL certificate provisioned via the AWS Certificate Manager (ACM) service.

Note

Some of the commands used in the following examples produce large amounts of output. Except for cases where it is critical to the understanding of the command, we will omit the majority of the output lines to save trees and enable you to focus better on the text.

Manual Provisioning

We started by working through the scenario manually, using the AWS web-based console. Nothing like experiencing the pain of doing things manually so that you can better enjoy the results of automating tedious work!

We first followed the documentation on the AWS website for hosting a static website in S3.

We already had a domain name bought via Namecheap: devops4all.dev. We created a hosted zone in Amazon Route 53 for the domain, and pointed the name servers for this domain in Namecheap to the AWS DNS servers handling the hosted zone.

We provisioned two S3 buckets, one for the root URL of the site (devops4all.dev) and one for the www URL (www.devops4all.dev). The idea was to redirect requests to www to the root URL. We also went through the guide and configured the buckets for static site hosting, with the proper permissions. We uploaded an index.html file and a JPG image to the root S3 bucket.

The next step was to provision an SSL certificate to handle both the root domain name (devops4all.dev) and any subdomain of that domain (*.devops4all.dev). For verification, we used DNS records that we added to the Route 53 hosted zone.

Note

The ACM certificate needs to be provisioned in the us-east-1 AWS region so that it can be used in CloudFront.

We then created an AWS CloudFront CDN distribution pointing to the root S3 bucket and used the ACM certificate provisioned in the previous step. We specified that HTTP requests should be redirected to HTTPS. Once the distribution was deployed (which took approximately 15 minutes), we added Route 53 records for the root domain and the www domain as A records of type Alias pointing to the CloudFront distribution endpoint DNS name.

At the end of this exercise, we were able to go to http://devops4all.dev, be redirected automatically to https://devops4all.dev, and see the home page of the site showing the image we uploaded. We also tried going to http://www.devops4all.dev and were redirected to https://devops4all.dev.

The manual creation of all the AWS resources we mentioned took approximately 30 minutes. We also spent 15 minutes waiting for the CloudFront distribution propagation, for a total of 45 minutes. Note that we had done all this before, so we knew exactly what to do, with only minimal reference to the AWS guide.

Note

It is worth taking a moment to appreciate how easy it is these days to provision a free SSL certificate. Gone are the days when you had to wait hours or even days for the SSL certificate provider to approve your request, only after you submitted proof that your company existed. Between AWS ACM and Let’s Encrypt, there is no excuse not to have SSL enabled on all pages of your site in 2020.

Automated Infrastructure Provisioning with Terraform

We decided to use Terraform as the first IaC tool for the automation of these tasks, even though Terraform is not directly related to Python. It has several advantages, such as maturity, a strong ecosystem, and support for multiple cloud providers.

The recommended way of writing Terraform code is to use modules, which are reusable components of Terraform configuration code. There is a common registry of Terraform modules hosted by HashiCorp where you can search for ready-made modules that you might use for provisioning the resources you need. In this example, we will write our own modules.

The version of Terraform used here is 0.12.1, which is the latest version at the time of this writing. Install it on a Mac by using brew:

$ brew install terraform

Provisioning an S3 Bucket

Create a modules directory and underneath it an s3 directory containing three files: main.tf, variables.tf, and outputs.tf. The main.tf file in the s3 directory tells Terraform to create an S3 bucket with a specific policy. It uses a variable called domain_name that is declared in variables.tf and whose value is passed to it by the caller of this module. It outputs the DNS endpoint of the S3 bucket, which will be used by other modules as an input variable.

Here are the three files in modules/s3:

$ cat modules/s3/main.tf
resource "aws_s3_bucket" "www" {
  bucket = "www.${var.domain_name}"
  acl = "public-read"
  policy = <<POLICY
{
  "Version":"2012-10-17",
  "Statement":[
    {
      "Sid":"AddPerm",
      "Effect":"Allow",
      "Principal": "*",
      "Action":["s3:GetObject"],
      "Resource":["arn:aws:s3:::www.${var.domain_name}/*"]
    }
  ]
}
POLICY

  website {
    index_document = "index.html"
  }
}

$ cat modules/s3/variables.tf
variable "domain_name" {}

$ cat modules/s3/outputs.tf
output "s3_www_website_endpoint" {
  value = "${aws_s3_bucket.www.website_endpoint}"
}

Note

The policy attribute of the aws_s3_bucket resource above is an example of an S3 bucket policy that allows public access to the bucket. If you work with S3 buckets in an IaC context, it pays to familiarize yourself with the official AWS documentation on bucket and user policies.

The main Terraform script that ties together all the modules is a file called main.tf in the current directory:

$ cat main.tf
provider "aws" {
  region = "${var.aws_region}"
}

module "s3" {
  source = "./modules/s3"
  domain_name = "${var.domain_name}"
}

It refers to variables that are defined in a separate file called variables.tf:

$ cat variables.tf
variable "aws_region" {
  default = "us-east-1"
}

variable "domain_name" {
  default = "devops4all.dev"
}

Here is the current directory tree at this point:

|____main.tf
|____variables.tf
|____modules
| |____s3
| | |____outputs.tf
| | |____main.tf
| | |____variables.tf

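The first step in running Terraform is to invoke the terraform init command, which initializes the working directory: it reads the contents of any module referenced by the main file and downloads the provider plugins the configuration needs (here, the AWS provider).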

The next step is to run the terraform plan command, which creates the blueprint mentioned in the earlier discussion.

To create the resources specified in the plan, run terraform apply:

$ terraform apply

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # module.s3.aws_s3_bucket.www will be created
  + resource "aws_s3_bucket" "www" {
      + acceleration_status = (known after apply)
      + acl  = "public-read"
      + arn  = (known after apply)
      + bucket  = "www.devops4all.dev"
      + bucket_domain_name  = (known after apply)
      + bucket_regional_domain_name = (known after apply)
      + force_destroy = false
      + hosted_zone_id= (known after apply)
      + id= (known after apply)
      + policy  = jsonencode(
            {
              + Statement = [
                  + {
                      + Action = [
                          + "s3:GetObject",
                        ]
                      + Effect = "Allow"
                      + Principal = "*"
                      + Resource  = [
                          + "arn:aws:s3:::www.devops4all.dev/*",
                        ]
                      + Sid = "AddPerm"
                    },
                ]
              + Version= "2012-10-17"
            }
        )
      + region  = (known after apply)
      + request_payer = (known after apply)
      + website_domain= (known after apply)
      + website_endpoint = (known after apply)

      + versioning {
          + enabled = (known after apply)
          + mfa_delete = (known after apply)
        }

      + website {
          + index_document = "index.html"
        }
    }

Plan: 1 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

module.s3.aws_s3_bucket.www: Creating...
module.s3.aws_s3_bucket.www: Creation complete after 7s [www.devops4all.dev]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

At this point, check that the S3 bucket was created using the AWS web console UI.
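
Since this is a Python book, the init, plan, and apply sequence can also be driven from a small script. The following is a minimal sketch, assuming the terraform binary is on the PATH; the helper names terraform_commands and run_workflow and the plan file name tfplan are hypothetical choices for this example. Saving the plan with -out and then applying that file means apply executes exactly the plan that was reviewed, without prompting for confirmation.

```python
import subprocess


def terraform_commands(plan_file="tfplan"):
    """Return the init/plan/apply command lines in the order they should run."""
    return [
        ["terraform", "init"],
        ["terraform", "plan", f"-out={plan_file}"],
        ["terraform", "apply", plan_file],
    ]


def run_workflow(workdir=".", runner=subprocess.run):
    """Run each command in workdir; check=True stops at the first failure."""
    for command in terraform_commands():
        runner(command, cwd=workdir, check=True)


# Example usage, from the directory that contains the root main.tf:
#   run_workflow(".")
```

The runner parameter exists only so that the function can be exercised without invoking Terraform, for example by passing a function that records the commands it receives.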

Provisioning an SSL Certificate with AWS ACM

The next module is created for the provisioning of an SSL certificate using the AWS Certificate Manager service. Create a directory called modules/acm with three files: main.tf, variables.tf, and outputs.tf. The main.tf file in the acm directory tells Terraform to create an ACM SSL certificate using DNS as the validation method. It uses a variable called domain_name which is declared in variables.tf and whose value is passed to it by the caller of this module. It outputs the ARN identifier of the certificate, which will be used by other modules as an input variable.

$ cat modules/acm/main.tf
resource "aws_acm_certificate" "certificate" {
  domain_name = "*.${var.domain_name}"
  validation_method = "DNS"
  subject_alternative_names = ["*.${var.domain_name}"]
}

$ cat modules/acm/variables.tf
variable "domain_name" {
}

$ cat modules/acm/outputs.tf
output "certificate_arn" {
  value = "${aws_acm_certificate.certificate.arn}"
}

Add a reference to the new acm module in the main Terraform file:

$ cat main.tf
provider "aws" {
  region = "${var.aws_region}"
}

module "s3" {
  source = "./modules/s3"
  domain_name = "${var.domain_name}"
}

module "acm" {
  source = "./modules/acm"
  domain_name = "${var.domain_name}"
}

The next three steps are the same as in the S3 bucket creation sequence: terraform init, terraform plan, and terraform apply.

Use the AWS console to add the necessary Route 53 records for the validation process. The certificate is normally validated and issued in a few minutes.

Provisioning an Amazon CloudFront Distribution

The next module is created for the provisioning of an Amazon CloudFront distribution. Create a directory called modules/cloudfront with three files: main.tf, variables.tf, and outputs.tf. The main.tf file in the cloudfront directory tells Terraform to create a CloudFront distribution resource. It uses several variables that are declared in variables.tf and whose values are passed to it by the caller of this module. It outputs the DNS domain name for the CloudFront endpoint and the hosted Route 53 zone ID for the CloudFront distribution, which will be used by other modules as input variables:

$ cat modules/cloudfront/main.tf
resource "aws_cloudfront_distribution" "www_distribution" {
  origin {
    custom_origin_config {
      // These are all the defaults.
      http_port              = "80"
      https_port             = "443"
      origin_protocol_policy = "http-only"
      origin_ssl_protocols   = ["TLSv1", "TLSv1.1", "TLSv1.2"]
    }

    domain_name = "${var.s3_www_website_endpoint}"
    origin_id   = "www.${var.domain_name}"
  }

  enabled             = true
  default_root_object = "index.html"

  default_cache_behavior {
    viewer_protocol_policy = "redirect-to-https"
    compress               = true
    allowed_methods        = ["GET", "HEAD"]
    cached_methods         = ["GET", "HEAD"]
    target_origin_id       = "www.${var.domain_name}"
    min_ttl                = 0
    default_ttl            = 86400
    max_ttl                = 31536000

    forwarded_values {
      query_string = false
      cookies {
        forward = "none"
      }
    }
  }

  aliases = ["www.${var.domain_name}"]

  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }

  viewer_certificate {
    acm_certificate_arn = "${var.acm_certificate_arn}"
    ssl_support_method  = "sni-only"
  }
}

$ cat modules/cloudfront/variables.tf
variable "domain_name" {}
variable "acm_certificate_arn" {}
variable "s3_www_website_endpoint" {}

$ cat modules/cloudfront/outputs.tf
output "domain_name" {
  value = "${aws_cloudfront_distribution.www_distribution.domain_name}"
}

output "hosted_zone_id" {
  value = "${aws_cloudfront_distribution.www_distribution.hosted_zone_id}"
}

Add a reference to the cloudfront module in the main Terraform file. Pass s3_www_website_endpoint and acm_certificate_arn as input variables to the cloudfront module. Their values are retrieved from the outputs of the other modules, s3 and acm, respectively.

Note

ARN stands for Amazon Resource Name. It is a string that uniquely identifies a given AWS resource. You will see many ARN values generated and passed around as variables when you use IaC tools that operate within AWS.

$ cat main.tf
provider "aws" {
  region = "${var.aws_region}"
}

module "s3" {
  source = "./modules/s3"
  domain_name = "${var.domain_name}"
}

module "acm" {
  source = "./modules/acm"
  domain_name = "${var.domain_name}"
}

module "cloudfront" {
  source = "./modules/cloudfront"
  domain_name = "${var.domain_name}"
  s3_www_website_endpoint = "${module.s3.s3_www_website_endpoint}"
  acm_certificate_arn = "${module.acm.certificate_arn}"
}

The next three steps are the usual ones for the provisioning of resources with Terraform: terraform init, terraform plan, and terraform apply.

The terraform apply step took almost 23 minutes in this case. Provisioning an Amazon CloudFront distribution is one of the lengthiest operations in AWS, because the distribution is being deployed globally by Amazon behind the scenes.

Provisioning a Route 53 DNS Record

The next module is for the creation of a Route 53 DNS record for the main domain of the site, www.devops4all.dev. Create a directory called modules/route53 with two files: main.tf and variables.tf. The main.tf file in the route53 directory tells Terraform to create a Route 53 DNS record of type A as an alias to the DNS name of the CloudFront endpoint. It uses several variables that are declared in variables.tf and whose values are passed to it by the caller of this module:

$ cat modules/route53/main.tf
resource "aws_route53_record" "www" {
  zone_id = "${var.zone_id}"
  name = "www.${var.domain_name}"
  type = "A"

  alias {
    name                   = "${var.cloudfront_domain_name}"
    zone_id                = "${var.cloudfront_zone_id}"
    evaluate_target_health = false
  }
}

$ cat modules/route53/variables.tf
variable "domain_name" {}
variable "zone_id" {}
variable "cloudfront_domain_name" {}
variable "cloudfront_zone_id" {}

Add a reference to the route53 module in the main.tf Terraform file. Pass zone_id, cloudfront_domain_name, and cloudfront_zone_id as input variables to the route53 module. The value of zone_id is declared in variables.tf in the current directory, while the other values are retrieved from the outputs of the cloudfront module:

$ cat main.tf
provider "aws" {
  region = "${var.aws_region}"
}

module "s3" {
  source = "./modules/s3"
  domain_name = "${var.domain_name}"
}

module "acm" {
  source = "./modules/acm"
  domain_name = "${var.domain_name}"
}

module "cloudfront" {
  source = "./modules/cloudfront"
  domain_name = "${var.domain_name}"
  s3_www_website_endpoint = "${module.s3.s3_www_website_endpoint}"
  acm_certificate_arn = "${module.acm.certificate_arn}"
}

module "route53" {
  source = "./modules/route53"
  domain_name = "${var.domain_name}"
  zone_id = "${var.zone_id}"
  cloudfront_domain_name = "${module.cloudfront.domain_name}"
  cloudfront_zone_id = "${module.cloudfront.hosted_zone_id}"
}

$ cat variables.tf
variable "aws_region" {
  default = "us-east-1"
}

variable "domain_name" {
  default = "devops4all.dev"
}

variable "zone_id" {
  default = "ZWX18ZIVHAA5O"
}

The next three steps, which should be very familiar to you by now, are for the provisioning of resources with Terraform: terraform init, terraform plan, and terraform apply.
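
Notice that main.tf never states the order in which the modules should be provisioned. Terraform works this out from the references between them: a module that consumes another module's output has to wait for it. The sketch below illustrates that idea with Python's standard graphlib module (Python 3.9 or later). It is a toy illustration of dependency ordering, not Terraform's actual implementation, and the module_inputs dictionary is a hand-written summary of the references in the root main.tf.

```python
from graphlib import TopologicalSorter

# Each module maps to the set of modules whose outputs it consumes,
# mirroring the references in the root main.tf shown above.
module_inputs = {
    "s3": set(),
    "acm": set(),
    "cloudfront": {"s3", "acm"},   # needs the bucket endpoint and the cert ARN
    "route53": {"cloudfront"},     # needs the distribution's domain and zone ID
}

# static_order() yields every module after all of the modules it depends on.
order = list(TopologicalSorter(module_inputs).static_order())
print(order)
```

The s3 and acm modules have no dependencies on each other, so either may come first; cloudfront always follows both, and route53 always comes last.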

Copying Static Files to S3

To test the provisioning of the static website from end to end, create a simple file called index.html that includes a JPEG image, and copy both files to the S3 bucket previously provisioned with Terraform. Make sure that the AWS_PROFILE environment variable is set to a correct value already present in the ~/.aws/credentials file:

$ echo $AWS_PROFILE
gheorghiu-net
$ aws s3 cp static_files/index.html s3://www.devops4all.dev/index.html
upload: static_files/index.html to s3://www.devops4all.dev/index.html
$ aws s3 cp static_files/devops4all.jpg s3://www.devops4all.dev/devops4all.jpg
upload: static_files/devops4all.jpg to s3://www.devops4all.dev/devops4all.jpg

Visit https://www.devops4all.dev/ and verify that you can see the JPG image that was uploaded.

Deleting All AWS Resources Provisioned with Terraform

Whenever you provision cloud resources, you need to be mindful of the cost associated with them. It is very easy to forget about them, and you may be surprised by the AWS bill you receive at the end of the month. Make sure to delete all the resources provisioned above by running the terraform destroy command. Note that the contents of the S3 bucket need to be removed before running terraform destroy: the bucket was created with force_destroy set to false (see the plan output earlier), so Terraform will not delete it while it is nonempty.

Note

Before running the terraform destroy command, make sure you will not delete resources that might still be used in production!
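
Emptying the bucket before terraform destroy can be scripted as well. The following is a minimal sketch, assuming an unversioned bucket and a client object that behaves like boto3.client("s3"); the function name empty_bucket is a hypothetical helper. The client is passed in as a parameter so that the function itself does not depend on boto3 being installed.

```python
def empty_bucket(s3_client, bucket_name):
    """Delete every object in bucket_name and return how many were removed.

    s3_client is expected to behave like boto3.client("s3").
    """
    deleted = 0
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket_name):
        # An empty bucket returns a page without a "Contents" key.
        keys = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        if keys:
            # Each page holds at most 1,000 keys, which is also the
            # per-request limit for delete_objects.
            s3_client.delete_objects(Bucket=bucket_name, Delete={"Objects": keys})
            deleted += len(keys)
    return deleted


# Example usage (assumes boto3 is installed and AWS_PROFILE is set):
#   import boto3
#   empty_bucket(boto3.client("s3"), "www.devops4all.dev")
```

As with terraform destroy itself, double-check the bucket name before running this against a real account, since the deletion cannot be undone.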

Automated Infrastructure Provisioning with Pulumi

Pulumi is one of the new kids on the block when it comes to IaC tools. The keyword here is new, which means it is still somewhat rough around the edges, especially with regard to Python support.

Pulumi allows you to specify the desired state of your infrastructure by telling it which resources to provision using real programming languages. TypeScript was the first language supported by Pulumi, but nowadays Go and Python are also supported.

It is important to understand the difference between writing infrastructure automation code in Python using Pulumi and an AWS automation library such as Boto.

With Pulumi, your Python code describes the resources that you want to be provisioned. You are, in effect, creating the blueprint or the state discussed at the beginning of the chapter. This makes Pulumi similar to Terraform, but the big difference is that Pulumi gives you the full power of a programming language such as Python in terms of writing functions, loops, using variables, etc. You are not hampered by the use of a markup language such as Terraform’s HCL. Pulumi combines the power of a declarative approach, where you describe the desired end state, with the power of a real programming language.

With an AWS automation library such as Boto, you both describe and provision individual AWS resources through the code you write. There is no overall blueprint or state. You need to keep track of the provisioned resources yourself, and to orchestrate their creation and removal. This is the imperative or procedural approach for automation tools. You still get the advantage of writing Python code.

To start using Pulumi, create a free account on its website, pulumi.io. Then install the pulumi command-line tool on your local machine. On a Mac, use Homebrew to install pulumi.

The first command to run locally is pulumi login:

$ pulumi login
Logged into pulumi.com as griggheo (https://app.pulumi.com/griggheo)

Creating a New Pulumi Python Project for AWS

Create a directory called proj1, run pulumi new in that directory, and choose the aws-python template. As part of the project creation, pulumi asks for the name of a stack. Call it staging:

$ mkdir proj1
$ cd proj1
$ pulumi new
Please choose a template: aws-python        A minimal AWS Python Pulumi program
This command will walk you through creating a new Pulumi project.

Enter a value or leave blank to accept the (default), and press <ENTER>.
Press ^C at any time to quit.

project name: (proj1)
project description: (A minimal AWS Python Pulumi program)
Created project 'proj1'

stack name: (dev) staging
Created stack 'staging'

aws:region: The AWS region to deploy into: (us-east-1)
Saved config

Your new project is ready to go!
To perform an initial deployment, run the following commands:

   1. virtualenv -p python3 venv
   2. source venv/bin/activate
   3. pip3 install -r requirements.txt

Then, run 'pulumi up'

It is important to understand the difference between a Pulumi project and a Pulumi stack. A project is the code you write for specifying the desired state of the system, the resources you want Pulumi to provision. A stack is a specific deployment of the project. For example, a stack can correspond to an environment such as development, staging, or production. In the examples that follow, we will create two Pulumi stacks, one called staging that corresponds to a staging environment, and further down the line, another stack called prod that corresponds to a production environment.

Here are the files automatically generated by the pulumi new command as part of the aws-python template:

$ ls -la
total 40
drwxr-xr-x   7 ggheo  staff  224 Jun 13 21:43 .
drwxr-xr-x  11 ggheo  staff  352 Jun 13 21:42 ..
-rw-------   1 ggheo  staff   12 Jun 13 21:43 .gitignore
-rw-r--r--   1 ggheo  staff   32 Jun 13 21:43 Pulumi.staging.yaml
-rw-------   1 ggheo  staff   77 Jun 13 21:43 Pulumi.yaml
-rw-------   1 ggheo  staff  184 Jun 13 21:43 __main__.py
-rw-------   1 ggheo  staff   34 Jun 13 21:43 requirements.txt

Follow the instructions in the output of pulumi new and install virtualenv, then create a new virtualenv environment and install the libraries specified in requirements.txt:

$ pip3 install virtualenv
$ virtualenv -p python3 venv
$ source venv/bin/activate
(venv) pip3 install -r requirements.txt

Note

Before provisioning any AWS resources with pulumi up, make sure you are using the AWS account that you are expecting to target. One way to specify the desired AWS account is to set the AWS_PROFILE environment variable in your current shell. In our case, an AWS profile called gheorghiu-net was already set up in the local ~/.aws/credentials file.

(venv) export AWS_PROFILE=gheorghiu-net

The __main__.py file generated by Pulumi as part of the aws-python template is as follows:

$ cat __main__.py
import pulumi
from pulumi_aws import s3

# Create an AWS resource (S3 Bucket)
bucket = s3.Bucket('my-bucket')

# Export the name of the bucket
pulumi.export('bucket_name',  bucket.id)

Clone the Pulumi examples GitHub repository locally, then copy __main__.py and the www directory from pulumi-examples/aws-py-s3-folder into the current directory.

Here is the new __main__.py file in the current directory:

$ cat __main__.py
import json
import mimetypes
import os

from pulumi import export, FileAsset
from pulumi_aws import s3

web_bucket = s3.Bucket('s3-website-bucket', website={
    "index_document": "index.html"
})

content_dir = "www"
for file in os.listdir(content_dir):
    filepath = os.path.join(content_dir, file)
    mime_type, _ = mimetypes.guess_type(filepath)
    obj = s3.BucketObject(file,
        bucket=web_bucket.id,
        source=FileAsset(filepath),
        content_type=mime_type)

def public_read_policy_for_bucket(bucket_name):
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": "*",
            "Action": [
                "s3:GetObject"
            ],
            "Resource": [
                f"arn:aws:s3:::{bucket_name}/*",
            ]
        }]
    })

bucket_name = web_bucket.id
bucket_policy = s3.BucketPolicy("bucket-policy",
    bucket=bucket_name,
    policy=bucket_name.apply(public_read_policy_for_bucket))

# Export the name of the bucket
export('bucket_name',  web_bucket.id)
export('website_url', web_bucket.website_endpoint)

Note the use of Python variables for content_dir and bucket_name, the use of a for loop, and also the use of a regular Python function public_read_policy_for_bucket. It is refreshing to be able to use regular Python constructs in IaC programs!
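Because public_read_policy_for_bucket is an ordinary function, it can also be exercised outside of a Pulumi run. The following is a minimal sketch that repeats the function from __main__.py and calls it with a plain string. The bucket name example-bucket is made up for illustration; in the real program the function receives the resolved bucket name through bucket_name.apply.

```python
import json


def public_read_policy_for_bucket(bucket_name):
    # Same policy document as in __main__.py
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": "*",
            "Action": ["s3:GetObject"],
            "Resource": [f"arn:aws:s3:::{bucket_name}/*"],
        }]
    })


# Call the function with a plain string (a made-up bucket name)
# instead of a Pulumi output, then parse the JSON it returns.
policy = json.loads(public_read_policy_for_bucket("example-bucket"))
statement = policy["Statement"][0]
print(statement["Resource"][0])
```

Checking the policy this way catches typos in the JSON structure before any AWS resource is touched.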

Now it’s time to run pulumi up to provision the resources specified in __main__.py. This command will show all the resources that will be created. Moving the current choice to yes will kick off the provisioning process:

(venv) pulumi up
Previewing update (staging):

     Type                    Name               Plan
 +   pulumi:pulumi:Stack     proj1-staging      create
 +   ├─ aws:s3:Bucket        s3-website-bucket  create
 +   ├─ aws:s3:BucketObject  favicon.png        create
 +   ├─ aws:s3:BucketPolicy  bucket-policy      create
 +   ├─ aws:s3:BucketObject  python.png         create
 +   └─ aws:s3:BucketObject  index.html         create

Resources:
    + 6 to create

Do you want to perform this update? yes
Updating (staging):

     Type                    Name               Status
 +   pulumi:pulumi:Stack     proj1-staging      created
 +   ├─ aws:s3:Bucket        s3-website-bucket  created
 +   ├─ aws:s3:BucketObject  index.html         created
 +   ├─ aws:s3:BucketObject  python.png         created
 +   ├─ aws:s3:BucketObject  favicon.png        created
 +   └─ aws:s3:BucketPolicy  bucket-policy      created

Outputs:
    bucket_name: "s3-website-bucket-8e08f8f"
    website_url: "s3-website-bucket-8e08f8f.s3-website-us-east-1.amazonaws.com"

Resources:
    + 6 created

Duration: 14s

Inspect the existing Pulumi stacks:

(venv) pulumi stack ls
NAME      LAST UPDATE    RESOURCE COUNT  URL
staging*  2 minutes ago  7               https://app.pulumi.com/griggheo/proj1/staging

(venv) pulumi stack
Current stack is staging:
    Owner: griggheo
    Last updated: 3 minutes ago (2019-06-13 22:05:38.088773 -0700 PDT)
    Pulumi version: v0.17.16
Current stack resources (7):
    TYPE                              NAME
    pulumi:pulumi:Stack               proj1-staging
    pulumi:providers:aws              default
    aws:s3/bucket:Bucket              s3-website-bucket
    aws:s3/bucketPolicy:BucketPolicy  bucket-policy
    aws:s3/bucketObject:BucketObject  index.html
    aws:s3/bucketObject:BucketObject  favicon.png
    aws:s3/bucketObject:BucketObject  python.png

Inspect the outputs of the current stack:

(venv) pulumi stack output
Current stack outputs (2):
    OUTPUT       VALUE
    bucket_name  s3-website-bucket-8e08f8f
    website_url  s3-website-bucket-8e08f8f.s3-website-us-east-1.amazonaws.com

Visit the URL specified in the website_url output (http://s3-website-bucket-8e08f8f.s3-website-us-east-1.amazonaws.com) and make sure you can see the static site.

In the sections that follow, the Pulumi project will be enhanced by specifying more AWS resources to be provisioned. The goal is to have parity with the resources that were provisioned with Terraform: an ACM SSL certificate, a CloudFront distribution, and a Route 53 DNS record for the site URL.

Creating Configuration Values for the Staging Stack

The current stack is staging. Rename the existing www directory to www-staging, then use the pulumi config set command to specify two configuration values for the current staging stack: domain_name and local_webdir.

Tip

For more details on how Pulumi manages configuration values and secrets, see the Pulumi reference documentation.

(venv) mv www www-staging
(venv) pulumi config set local_webdir www-staging
(venv) pulumi config set domain_name staging.devops4all.dev

To inspect the existing configuration values for the current stack, run:

(venv) pulumi config
KEY           VALUE
aws:region    us-east-1
domain_name   staging.devops4all.dev
local_webdir  www-staging

Once the configuration values are set, use them in the Pulumi code:

import pulumi

config = pulumi.Config('proj1')  # proj1 is project name defined in Pulumi.yaml

content_dir = config.require('local_webdir')
domain_name = config.require('domain_name')
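The project name passed to pulumi.Config matters because Pulumi namespaces configuration keys. A key set without a prefix, such as domain_name, belongs to the project's namespace, while aws:region belongs to the AWS provider's namespace. The helper below is a hypothetical illustration of that naming rule written in plain Python. It is not part of the Pulumi SDK.

```python
def qualified_key(project, key):
    """Return the namespaced form of a configuration key.

    Hypothetical helper: keys that already carry a namespace
    (for example aws:region) are returned unchanged; bare keys
    are placed in the project's namespace.
    """
    if ":" in key:
        return key
    return f"{project}:{key}"


for key in ("aws:region", "domain_name", "local_webdir"):
    print(qualified_key("proj1", key))
```

This is why config.require('local_webdir') finds the value only when the Config object is created for the proj1 namespace.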

Now that the configuration values are in place, we will next provision an SSL certificate with the AWS Certificate Manager service.

Provisioning an ACM SSL Certificate

Around this point, Pulumi starts to show its rough edges when it comes to its Python SDK. Just reading the Pulumi Python SDK reference for the acm module is not sufficient to make sense of what you need to do in your Pulumi program.

Fortunately, there are many Pulumi examples in TypeScript that you can take inspiration from. One example that matches our use case is aws-ts-static-website.

Here is the TypeScript code for creating a new ACM certificate (from index.ts):

const certificate = new aws.acm.Certificate("certificate", {
    domainName: config.targetDomain,
    validationMethod: "DNS",
}, { provider: eastRegion });

Here is the equivalent Python code that we wrote:

from pulumi_aws import acm

cert = acm.Certificate('certificate', domain_name=domain_name,
    validation_method='DNS')

Tip

A rule of thumb in porting Pulumi code from TypeScript to Python is that parameter names that are camelCased in TypeScript become snake_cased in Python. As you can see in the earlier example, domainName becomes domain_name and validationMethod becomes validation_method.
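The renaming rule for constructor parameters can be expressed as a small helper. This is a sketch of the rule of thumb only: camel_to_snake is a name we made up, it is not something the Pulumi SDK provides, and it handles only simple names without runs of capital letters.

```python
import re


def camel_to_snake(name):
    """Convert a camelCased parameter name to snake_case.

    Inserts an underscore before every capital letter that is
    not at the start of the name, then lowercases the result.
    """
    return re.sub(r'(?<!^)(?=[A-Z])', '_', name).lower()


for ts_name in ("domainName", "validationMethod"):
    print(ts_name, "->", camel_to_snake(ts_name))
```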

Our next step was to provision a Route 53 zone and in that zone a DNS validation record for the ACM SSL certificate.

Provisioning a Route 53 Zone and DNS Records

Provisioning a new Route 53 zone with Pulumi is easy if you follow the Pulumi SDK reference for route53.

from pulumi_aws import route53

domain_name = config.require('domain_name')

# Split a domain name into its subdomain and parent domain names.
# e.g. "www.example.com" => "www", "example.com".
def get_domain_and_subdomain(domain):
    names = domain.split(".")
    if len(names) < 3:
        return ('', domain)
    subdomain = names[0]
    parent_domain = ".".join(names[1:])
    return (subdomain, parent_domain)

(subdomain, parent_domain) = get_domain_and_subdomain(domain_name)
zone = route53.Zone("route53_zone", name=parent_domain)

The preceding snippet uses a regular Python function to split the domain_name configuration value into two parts. If domain_name is staging.devops4all.dev, the function returns staging as the subdomain and devops4all.dev as the parent_domain.

The parent_domain variable is then used as a parameter to the constructor of the zone object, which tells Pulumi to provision a route53.Zone resource.
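To see how the function behaves on other inputs, it can be run on its own, without Pulumi. The sketch below repeats the function and tries it on the staging name, on the bare domain, and on a deeper name (a.b.devops4all.dev is made up for illustration).

```python
def get_domain_and_subdomain(domain):
    # Same logic as in the Pulumi program
    names = domain.split(".")
    if len(names) < 3:
        return ('', domain)
    subdomain = names[0]
    parent_domain = ".".join(names[1:])
    return (subdomain, parent_domain)


for name in ("staging.devops4all.dev",
             "devops4all.dev",
             "a.b.devops4all.dev"):
    print(name, "=>", get_domain_and_subdomain(name))
```

Note that only the first label is treated as the subdomain, so a name under a two-part suffix such as example.co.uk would be split into example and co.uk. That is fine for the domains used in this chapter, but worth keeping in mind if you reuse the function.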

Note

Once the Route 53 zone was created, we had to point the Namecheap name servers at the name servers specified in the DNS record for the new zone so that the zone could be publicly accessible.

All was well and good so far. The next step was to create both the ACM certificate and a DNS record to validate the certificate.

We first tried to port the example TypeScript code by applying the rule of thumb of turning camelCase parameter names into snake_case.

TypeScript:

    const certificateValidationDomain = new aws.route53.Record(
        `${config.targetDomain}-validation`, {
        name: certificate.domainValidationOptions[0].resourceRecordName,
        zoneId: hostedZoneId,
        type: certificate.domainValidationOptions[0].resourceRecordType,
        records: [certificate.domainValidationOptions[0].resourceRecordValue],
        ttl: tenMinutes,
    });

The first attempt at porting to Python by switching camelCase to snake_case:

cert = acm.Certificate('certificate',
    domain_name=domain_name, validation_method='DNS')

domain_validation_options = cert.domain_validation_options[0]

cert_validation_record = route53.Record(
  'cert-validation-record',
  name=domain_validation_options.resource_record_name,
  zone_id=zone.id,
  type=domain_validation_options.resource_record_type,
  records=[domain_validation_options.resource_record_value],
  ttl=600)

No luck. pulumi up shows this error:

AttributeError: 'dict' object has no attribute 'resource_record_name'

At this point, we were stumped, because the Python SDK documentation doesn’t include this level of detail. We did not know what attributes we needed to specify for the domain_validation_options object.

We were only able to get past this by adding the domain_validation_options object to the list of Pulumi exports, which are printed out by Pulumi at the end of the pulumi up operation:

export('domain_validation_options', domain_validation_options)

The output from pulumi up was:

+ domain_validation_options: {
  + domain_name        : "staging.devops4all.dev"
  + resourceRecordName : "_c5f82e0f032d0f4f6c7de17fc2c.staging.devops4all.dev."
  + resourceRecordType : "CNAME"
  + resourceRecordValue: "_08e3d475bf3aeda0c98.ltfvzjuylp.acm-validations.aws."
    }

Bingo! It turns out that the attributes of the domain_validation_options object are still camelCased.

Here is the second attempt at porting to Python, which was successful:

cert_validation_record = route53.Record(
  'cert-validation-record',
  name=domain_validation_options['resourceRecordName'],
  zone_id=zone.id,
  type=domain_validation_options['resourceRecordType'],
  records=[domain_validation_options['resourceRecordValue']],
  ttl=600)
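If you would rather not hardcode the camelCased spelling, one option is a small lookup helper that tries the snake_cased key first and falls back to the camelCased one. This is only a sketch that operates on a plain dictionary: get_option and snake_to_camel are hypothetical names, and the sample values are the ones printed by pulumi up earlier.

```python
def snake_to_camel(name):
    """Turn resource_record_name into resourceRecordName."""
    head, *rest = name.split('_')
    return head + ''.join(part.capitalize() for part in rest)


def get_option(options, snake_name):
    """Look up a key by its snake_case name, accepting either spelling."""
    if snake_name in options:
        return options[snake_name]
    return options[snake_to_camel(snake_name)]


# Plain dict with the same keys and values as the exported output
options = {
    'domain_name': "staging.devops4all.dev",
    'resourceRecordName':
        "_c5f82e0f032d0f4f6c7de17fc2c.staging.devops4all.dev.",
    'resourceRecordType': "CNAME",
    'resourceRecordValue':
        "_08e3d475bf3aeda0c98.ltfvzjuylp.acm-validations.aws.",
}

print(get_option(options, 'resource_record_type'))
print(get_option(options, 'domain_name'))
```

A key that is missing under both spellings still raises a KeyError, which is usually what you want in a provisioning script.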

Next, specify a new type of resource to be provisioned: a certificate validation completion resource. This causes the pulumi up operation to wait until ACM validates the certificate by checking the Route 53 validation record created earlier.

cert_validation_completion = acm.CertificateValidation(
        'cert-validation-completion',
        certificate_arn=cert.arn,
        validation_record_fqdns=[cert_validation_record.fqdn])

cert_arn = cert_validation_completion.certificate_arn

At this point, you have a fully automated way of provisioning an ACM SSL certificate and of validating it via DNS.

The next step is to provision the CloudFront distribution in front of the S3 bucket hosting the static files for the site.

Provisioning a CloudFront Distribution

Use the SDK reference for the Pulumi cloudfront module to figure out which constructor parameters to pass to cloudfront.Distribution. Consult the TypeScript example to find the proper values for those parameters.

Here is the final result:

from pulumi_aws import cloudfront

log_bucket = s3.Bucket('cdn-log-bucket', acl='private')

cloudfront_distro = cloudfront.Distribution('cloudfront-distro',
    enabled=True,
    aliases=[ domain_name ],
    origins=[
        {
          'originId': web_bucket.arn,
          'domainName': web_bucket.website_endpoint,
          'customOriginConfig': {
              'originProtocolPolicy': "http-only",
              'httpPort': 80,
              'httpsPort': 443,
              'originSslProtocols': ["TLSv1.2"],
            },
        },
    ],

    default_root_object="index.html",
    default_cache_behavior={
        'targetOriginId': web_bucket.arn,

        'viewerProtocolPolicy': "redirect-to-https",
        'allowedMethods': ["GET", "HEAD", "OPTIONS"],
        'cachedMethods': ["GET", "HEAD", "OPTIONS"],

        'forwardedValues': {
            'cookies': { 'forward': "none" },
            'queryString': False,
        },

        'minTtl': 0,
        'defaultTtl': 600,
        'maxTtl': 600,
    },
    price_class="PriceClass_100",
    custom_error_responses=[
        { 'errorCode': 404, 'responseCode': 404,
          'responsePagePath': "/404.html" },
    ],

    restrictions={
        'geoRestriction': {
            'restrictionType': "none",
        },
    },
    viewer_certificate={
        'acmCertificateArn': cert_arn,
        'sslSupportMethod': "sni-only",
    },
    logging_config={
        'bucket': log_bucket.bucket_domain_name,
        'includeCookies': False,
        'prefix': domain_name,
    })

Run pulumi up to provision the CloudFront distribution.

Provisioning a Route 53 DNS Record for the Site URL

The last step in the end-to-end provisioning of the resources for the staging stack was the relatively simple task of specifying a DNS record of type A as an alias to the domain of the CloudFront endpoint:

site_dns_record = route53.Record(
    'site-dns-record',
    name=subdomain,
    zone_id=zone.id,
    type="A",
    aliases=[
        {
            'name': cloudfront_distro.domain_name,
            'zoneId': cloudfront_distro.hosted_zone_id,
            'evaluateTargetHealth': True
        }
    ])

Run pulumi up as usual.

Visit https://staging.devops4all.dev and see the files uploaded to S3. Go to the logging bucket in the AWS console and make sure the CloudFront logs are there.

Let’s see how to deploy the same Pulumi project to a new environment, represented by a new Pulumi stack.

Creating and Deploying a New Stack

We decided to modify the Pulumi program so that it does not provision a new Route 53 zone, but instead uses the value of the zone ID for an existing zone as a configuration value.

To create the prod stack, use the command pulumi stack init and specify prod for its name:

(venv) pulumi stack init
Please enter your desired stack name: prod
Created stack 'prod'

Listing the stacks now shows the two stacks, staging and prod, with an asterisk next to prod signifying that prod is the current stack:

(venv) pulumi stack ls
NAME     LAST UPDATE     RESOURCE COUNT  URL
prod*    n/a             n/a             https://app.pulumi.com/griggheo/proj1/prod
staging  14 minutes ago  14              https://app.pulumi.com/griggheo/proj1/staging

Now it’s time to set the proper configuration values for the prod stack. Use a new dns_zone_id configuration value, set to the ID of the zone that was already created by Pulumi when it provisioned the staging stack:

(venv) pulumi config set aws:region us-east-1
(venv) pulumi config set local_webdir www-prod
(venv) pulumi config set domain_name www.devops4all.dev
(venv) pulumi config set dns_zone_id Z2FTL2X8M0EBTW

Change the code to read zone_id from the configuration and to not create the Route 53 zone object.
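A minimal sketch of that change follows. It is a variation on the instruction above, not the exact code from our program: it reuses the configured zone ID when dns_zone_id is set and creates a zone only when it is not, so one program can serve both stacks. StubConfig is a stand-in that lets the sketch run without Pulumi. In the real program, config would be pulumi.Config('proj1'), whose get method returns None for a missing key, and create_zone would wrap the route53.Zone constructor and return zone.id.

```python
class StubConfig:
    """Stand-in for pulumi.Config so the sketch runs without Pulumi."""

    def __init__(self, values):
        self._values = dict(values)

    def get(self, key):
        # Return None when the key is missing
        return self._values.get(key)


def resolve_zone_id(config, create_zone):
    """Use the configured zone ID if present, else create a zone."""
    zone_id = config.get('dns_zone_id')
    if zone_id is not None:
        return zone_id
    return create_zone()


prod = StubConfig({'dns_zone_id': 'Z2FTL2X8M0EBTW'})
staging = StubConfig({})

# The lambda stands in for code that would provision a Route 53 zone
print(resolve_zone_id(prod, lambda: 'id-of-new-zone'))
print(resolve_zone_id(staging, lambda: 'id-of-new-zone'))
```

Every resource that previously used zone.id would then use the value returned by resolve_zone_id.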

Run pulumi up to provision the AWS resources:

(venv) pulumi up
Previewing update (prod):

     Type                            Name               Plan
     pulumi:pulumi:Stack             proj1-prod
 +   ├─ aws:cloudfront:Distribution  cloudfront-distro  create
 +   └─ aws:route53:Record           site-dns-record    create

Resources:
    + 2 to create
    10 unchanged

Do you want to perform this update? yes
Updating (prod):

     Type                            Name               Status
     pulumi:pulumi:Stack             proj1-prod
 +   ├─ aws:cloudfront:Distribution  cloudfront-distro  created
 +   └─ aws:route53:Record           site-dns-record    created

Outputs:
+ cloudfront_domain: "d3uhgbdw67nmlc.cloudfront.net"
+ log_bucket_id    : "cdn-log-bucket-53d8ea3"
+ web_bucket_id    : "s3-website-bucket-cde"
+ website_url      : "s3-website-bucket-cde.s3-website-us-east-1.amazonaws.com"

Resources:
    + 2 created
    10 unchanged

Duration: 18m54s

Success! The prod stack was fully deployed.

However, the contents of the www-prod directory containing the static files for the site are identical at this point to the contents of the www-staging directory.

Modify www-prod/index.html to change “Hello, S3!” to “Hello, S3 production!”, then run pulumi up again to detect the changes and upload the modified file to S3:

(venv) pulumi up
Previewing update (prod):

     Type                    Name        Plan       Info
     pulumi:pulumi:Stack     proj1-prod
 ~   └─ aws:s3:BucketObject  index.html  update     [diff: ~source]

Resources:
    ~ 1 to update
    11 unchanged

Do you want to perform this update? yes
Updating (prod):

     Type                    Name        Status      Info
     pulumi:pulumi:Stack     proj1-prod
 ~   └─ aws:s3:BucketObject  index.html  updated     [diff: ~source]

Outputs:
cloudfront_domain: "d3uhgbdw67nmlc.cloudfront.net"
log_bucket_id    : "cdn-log-bucket-53d8ea3"
web_bucket_id    : "s3-website-bucket-cde"
website_url      : "s3-website-bucket-cde.s3-website-us-east-1.amazonaws.com"

Resources:
    ~ 1 updated
    11 unchanged

Duration: 4s

Invalidate the cache of the CloudFront distribution to see the change.

Visit https://www.devops4all.dev and see the message: Hello, S3 production!

One caveat about IaC tools that keep track of the state of the system: there are situations when the state as seen by the tool will be different from the actual state. In that case, it is important to synchronize the two states; otherwise, they will drift apart more and more and you will be in the situation where you don’t dare make any more changes for fear that you will break production. It’s not for nothing that the word Code is prominent in Infrastructure as Code. Once you commit to using an IaC tool, best practices say that you should provision all resources via code, and no longer spin up any resource manually. It is hard to maintain this discipline, but it pays dividends in the long run.
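As a toy illustration of what drift means, the sketch below compares the state recorded by a tool with the state that actually exists and reports every difference. The data is made up and find_drift is a hypothetical helper. Real tools do this comparison against the cloud provider's APIs; the Pulumi CLI, for instance, has a refresh command for reconciling a stack's recorded state with the actual resources.

```python
def find_drift(recorded, actual):
    """Return {name: (recorded_value, actual_value)} for each mismatch."""
    drift = {}
    for name in sorted(set(recorded) | set(actual)):
        if recorded.get(name) != actual.get(name):
            drift[name] = (recorded.get(name), actual.get(name))
    return drift


# State as the IaC tool remembers it (made-up values)
recorded = {"index.html": "Hello, S3!", "python.png": "v1"}

# State as it really is after someone made manual changes
actual = {
    "index.html": "Hello, S3 production!",
    "python.png": "v1",
    "manual.txt": "added by hand",
}

for name, (old, new) in find_drift(recorded, actual).items():
    print(f"{name}: recorded={old!r} actual={new!r}")
```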

Exercises

  • Provision the same set of AWS resources by using the AWS Cloud Development Kit.

  • Use Terraform or Pulumi to provision cloud resources from other cloud providers, such as Google Cloud Platform or Microsoft Azure.
