Chapter 5. Building a Microservice Infrastructure

In the previous chapter we built a CI/CD pipeline for infrastructure changes. The infrastructure for our Microservices system will be defined in code and we’ll be able to use the pipeline to automate the testing and implementation of that code. With our automated pipeline in place, we can start writing the code that will define the infrastructure for our Microservices based application. That’s what we’ll focus on in this chapter.

Setting up the right infrastructure is vital to getting the most out of your Microservices system. Microservices give us a nice way of breaking the parts of our application into bite-sized pieces. But, we’ll need a lot of supporting infrastructure to make all those bite-sized services work together properly. Before we can tackle the challenges of designing and engineering the services themselves, we’ll need to spend some time establishing a network architecture and a deployment architecture for the services to use.

In this chapter, we’ll write Terraform code to define a declarative, immutable infrastructure to enable the Microservices we’ll be building. We’ll create a software defined network in AWS, implement a managed Kubernetes service, install a service mesh and set up a continuous delivery server. We’ll use the pipeline we built in the last chapter to test and apply these changes in a “sandbox” environment.

By the end of this chapter, you’ll have a cloud-based infrastructure configuration that’s designed to host the Microservices we’ll be building in the next chapter. But, before we get to the code, let’s start by taking a tour of the infrastructure that we’ll be building.

Infrastructure Components

The infrastructure is the set of components that we need to put in place so that we can deploy, manage and support our Microservices based application. A Microservices infrastructure can include a lot of parts: hardware, software, networks and tools. That means that the scope of components we’ll need to set up is quite large and getting all of those parts up and running is a gargantuan task.

Thankfully, as Microservices and “cloud native” approaches have matured, there has been an explosion in tools and services that are designed to meet much of our infrastructure needs. Keeping in line with our guiding design principles, we’ll use tools as much as we can and we’ll focus on getting the infrastructure to work with a single cloud platform (AWS) rather than making something that is cloud agnostic and portable. These decisions will make it possible for us to define a feature rich Microservices infrastructure in the small space of this single chapter.

But, keep in mind that we won’t end up with a complete, production ready infrastructure. For example, we won’t be going into any detail on security event management, operations controls or system logging and support. But, by the end of this chapter we’ll have built a powerful infrastructure stack that can run our Microservices resiliently, efficiently and safely.

Our infrastructure will consist of three major components: a software defined network, a Kubernetes service for orchestrating containers and a service mesh. We’ll also be installing a continuous delivery server called ArgoCD as a last step. Let’s walk through each of these parts in more detail, starting with the network.

The Network

We can’t run Microservices without a network. So, we’ll need to make sure that we design a network that works for the architecture that we plan to build. We’ve already made the decision to deploy our services into a Cloud architecture, which means that we’ll be creating a virtual cloud based network instead of building a physical one. We won’t be dealing with physical routers, cables or network devices, but we will need to configure the network resources that our Cloud Vendor provides for us. In our case, we’ve made a decision to use AWS for our Cloud Foundation, so we’ll be building our virtual network using AWS’ Virtual Private Cloud (VPC) system.

Ultimately, we plan to run our Microservices in Docker containers on top of AWS’ managed Kubernetes system. That has some big implications for the way we design our network. We’ll need to configure special traffic rules and we’ll need the network to span more than one availability zone to support fail-over cases. That means our deployment architecture will be split across multiple physical data centres, and if one of those data centres fails, traffic can be shifted to the working servers in the data centres that are still operational.

When we’re done, our network will consist of a VPC, two public subnets that are connected to the Internet and two private subnets that will host our Microservices. We’ll also define routing and security rules so that the network works the way we want and the way that our Kubernetes service expects.

The Kubernetes Service

In the opening chapters of this book, we highlighted the importance of independence as a feature for Microservices. We also established that we would build on Heroku’s Twelve-Factor application development principles. Containerization as an application packaging method embodies the twelve-factor approach and gives our services an incredible degree of independence. Containers give us the advantages of running applications in a predictable, isolated system configuration without the overhead and heavy lifting that comes with a virtual machine deployment. Microservices and containers are a natural fit.

Containers make it easy for us to build Microservices that run predictably across environments as self-contained units. But, containers don’t know how to start themselves, scale themselves up or heal themselves when they break. Containers work great in isolation, but there is a lot of operations work needed to manage them in production-like environments. That’s where Kubernetes comes in.

Kubernetes is a container orchestration tool developed by Google that solves the problems of working with containers at a system level. It provides a tool based solution for deploying, scaling, observing and managing container based applications. It can help you roll out and roll back container deployments, automatically create or destroy containers based on demand patterns, mount storage systems, manage secrets and help with load balancing and traffic management. Kubernetes can do a lot of complicated and complex work and it’s quickly become an essential part of a Microservices infrastructure stack.

But, Kubernetes is also pretty complicated itself. That means we won’t be diving into the details of how Kubernetes works in this book. But, we will be able to put together a working Kubernetes infrastructure hosted on AWS. We’ll be able to use a managed Kubernetes service from Amazon called the Elastic Kubernetes Service (EKS) that will handle some of the Kubernetes complexity for us. When we’re done we will have a working Kubernetes cluster that is ready to handle the workload of the Microservices that we will create in the next chapter.

Tip

If you want to learn about Kubernetes, our favourite book on the topic is Kubernetes: Up & Running by Brendan Burns, Joe Beda, and Kelsey Hightower.

Service Mesh

Kubernetes is a feature-packed and powerful tool for managing container based Microservices systems. But, a lot of practitioners have made a service mesh an essential add-on to their Microservices infrastructure stack. A service mesh is a network of proxies that sit in front of the Microservices that you deploy and help them communicate over the network. For example, a service mesh can help Microservices discover and send messages to other Microservices, secure traffic and apply access controls, log messages and shape traffic across the system.

The truth is that Kubernetes can do all of the things we’ve described here. But, service meshes have become popular because the load balancing, traffic shaping and communication features they provide are more advanced than what Kubernetes provides out of the box. This becomes particularly important for a Microservices deployment where resiliency patterns, access control and advanced deployment patterns are needed to keep the system running efficiently. Without a good service mesh, we’d have to build a lot of capability directly into our Microservices themselves.

An example of this kind of capability is the circuit breaker pattern that Michael Nygard describes in his book “Release It!” This pattern describes a way of handling the situation of an API or service call failing over the network. Rather than have a Microservice continue to try to connect to a system that is unavailable and having to block and wait for an error result, the circuit breaker pattern prescribes that it should fail fast instead. That means we should implement a small piece of logic that causes the Microservice to receive an error immediately in the case where a previous call has failed. Then, when the backend system is up again, the circuit breaker logic can allow requests through as normal.

This is an example of a resiliency pattern and it’s an important part of building a Microservices system. Small problems can add up quickly when we have lots of services. So, we need solutions that are able to adapt, heal and reduce the blast radius and impact of the problems they encounter. Without a good service mesh we’d need to build this kind of capability into every service that we deploy. But, now we can simply deploy services and use the network normally, knowing that the mesh of proxies will implement the policies and patterns we need to keep traffic running efficiently.

In this chapter, we won’t be doing much with our service mesh, but we will install the Istio service mesh as part of our Kubernetes cluster installation. Later in the book, we’ll take advantage of some of Istio’s features for managing traffic in our Microservices system.
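
To make this concrete, here is a rough sketch of what a circuit-breaking policy can look like when it is expressed declaratively in Istio instead of inside service code. The service host name and the threshold values below are purely illustrative and we won’t be applying this configuration in this chapter:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: reservations-circuit-breaker   # hypothetical service name
spec:
  host: reservations.default.svc.cluster.local
  trafficPolicy:
    outlierDetection:
      consecutive5xxErrors: 5     # eject a host after five consecutive errors
      interval: 30s               # how often hosts are evaluated
      baseEjectionTime: 60s       # how long a failing host stays ejected
      maxEjectionPercent: 100

With a rule like this in place, the proxies in the mesh stop sending requests to a failing instance and fail fast on its behalf, which is exactly the behaviour the circuit breaker pattern calls for.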

Continuous Delivery Server

In the previous chapter we built a Continuous Integration and Continuous Delivery pipeline for our infrastructure. But, we’ll also need to establish a CI/CD solution for testing, building and deploying our Microservices. We’ll get into the details of that solution later in the book, but in this chapter we’ll install a Kubernetes deployment tool called ArgoCD that we can use later.

ArgoCD is a continuous delivery tool that gives us a user-friendly way to deploy Microservice containers into the Kubernetes cluster we are building in this chapter. It works by connecting to a Git based repository and synchronizing the services running in the cluster with a declarative deployment description.

There are lots of ways of deploying applications to Kubernetes, including just using Kubernetes objects and a CLI. We’ve chosen to use ArgoCD for our solution because we like the declarative “GitOps” approach that it takes. This means that the source of truth for our deployment will be managed as declarative code in a Git based repository. That aligns really well with the approach we’ve taken already with our infrastructure. It means that we’ll be able to continue to use Github and Github Actions to manage our code and workflow, with ArgoCD pulling microservice packages from repositories when they are ready to be deployed.
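
To give you a feel for that declarative style, here is a rough sketch of what an ArgoCD Application definition can look like. The application name, namespaces and repository URL are placeholders, and we’ll define the real deployment descriptors later in the book:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: flight-reservations        # hypothetical microservice
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/flight-reservations-deploy   # placeholder repository
    targetRevision: HEAD
    path: k8s
  destination:
    server: https://kubernetes.default.svc
    namespace: flights
  syncPolicy:
    automated: {}

ArgoCD continuously compares what is running in the cluster with the manifests in the Git repository and synchronizes the two, which is what makes the Git repository the source of truth.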

By the end of this chapter, we’ll have created a sandbox environment with an ArgoCD server deployed in an AWS managed Kubernetes cluster, running on a software defined network. It’s going to take a lot of Terraform code to declare all of those parts and that’s what we’ll cover in the next section.

Implementing The Infrastructure

In the previous chapter we established a decision to use Terraform to write the code that defines our infrastructure and Github Actions to test and apply our infrastructure changes. In this section, we’ll break our infrastructure design into discrete Terraform modules and call them from the “sandbox” environment we started building in the previous chapter. We’ll start by reviewing the tools you’ll need in your infrastructure development workspace.

Getting Your Workspace Ready

If you’ve followed along with the instructions in the previous chapter, you should already have the tools you need to write and test Terraform code. That means you should have already done the following:

  1. Installed the AWS command line interface

  2. Created a Terraform user and group in AWS

  3. Configured AWS with your credentials

  4. Installed the Terraform CLI

  5. Created a Github repository for the Sandbox environment

  6. Configured a Github Actions pipeline for the Sandbox environment

If you haven’t set up your Github Actions pipeline yet, or you had trouble getting it to work the way we’ve described, you can create a fork of a basic Sandbox environment by following the instructions at https://github.com/implementing-microservices/sandbox-env-starter

In addition to all of the requirements above, you’ll also need to install a command line application for interacting with Kubernetes clusters called kubectl. We’ll be using kubectl to test the environment and make sure that we’ve successfully provisioned our Kubernetes cluster and the ArgoCD continuous delivery tool.

Follow the instructions at https://kubernetes.io/docs/tasks/tools/install-kubectl/ to install kubectl in your local system.
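
Once kubectl is installed, you can confirm that the client is working with a quick version check. You don’t need a running cluster for this command:

$ kubectl version --client

If the command prints a client version, kubectl is ready to use and we’ll connect it to a real cluster later in the chapter.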

With the workspace setup and ready to go, we can move onto writing the Terraform modules that will define the infrastructure.

Setting Up The Module Repositories

When you write professional software, it’s important to write clean, professional code. When code is too difficult to understand, too difficult to maintain or too difficult to change, the project becomes costly to operate and maintain. All of that is true for our infrastructure code as well.

Taking the IaC approach means we’ll need to apply good code practices to our infrastructure project. The good news is that we have lots of existing guidance in our industry on how to write code that is easier to learn, understand and extend. The bad news is that not every principle and practice from traditional software development is going to be easy to implement in the IaC domain. That’s partly because the tooling and languages for IaC are still evolving and partly because the context of changing a live, physical device is a different model from the traditional software development one.

But, with Terraform we’ll be able to apply three essential coding practices that will help us write clean, easier to maintain code:

  • Modularization - Writing small functions that do one thing well

  • Encapsulation - Hiding internal data structures and implementation details

  • Avoid Repetition - Don’t Repeat Yourself (DRY), implementing code once in only one location

Terraform’s built-in support for modules of infrastructure code will help us address all three. We’ll be able to maintain our infrastructure code as a set of re-usable, encapsulated modules. We’ll build modules for each of the architecturally significant parts of our system: networks, the API gateway and the managed Amazon Kubernetes service (called EKS). Once we have our re-usable modules in place, we’ll be able to implement another set of Terraform files that use them. We’ll be able to have a different Terraform file for each environment that we want to create without repeating the same infrastructure declarations in each one.

TK diagram of modules and environments

This approach allows us to easily spin up new environments by creating new Terraform files that re-use the modules we’ve developed. It also lets us make changes in just one place when we want to change an infrastructure configuration across all environments. We can start by creating a simple module that defines a basic network and an environment file that uses it.

The infrastructure code we are writing in this chapter uses Terraform’s module structure. Each module will have its own directory and will contain a variables.tf, main.tf and outputs.tf file. The advantage of this approach is that you can define a module once and use it in a parameterised way to build multiple environments. You can learn more about Terraform modules in the Terraform documentation.

We’re going to create two modules for our Microservices infrastructure. First, we’ll create an AWS networking module that contains a declarative configuration of our software defined network. We’ll also create a Kubernetes module that defines an AWS based managed Kubernetes configuration for our environments. We’ll be able to use both of these modules to create our sandbox environment.

Don’t Use Our Configuration Files In Your Production Environment!

We’ve done our best to design an infrastructure that mirrors production environments that large organizations use for Microservices. But, the constraints of the written word prevent us from giving you a comprehensive, all encompassing set of configurations that will work for your particular environment, security needs and constraints. You can use this chapter as a quick starter and guide to the tools you’ll need, but we highly advise that you spend time designing your own production-grade infrastructure, configuration and architecture.

In the previous chapter, we created a Github based code repository for the sandbox environment and an associated infrastructure CI/CD pipeline. Now, we’ll need a place to store and manage the Terraform code for the modules that the sandbox Terraform file will be using. One option is to keep the sandbox environment definition and our module code together in the same code repository. For example, we could create a directory for our environments and a directory for our modules and keep all the code in one infrastructure repository. But, there are two reasons we’ll choose to keep each module in its own repo:

  1. In the previous chapter, we made a decision to use Github Actions as our CI/CD tool. If we keep multiple environment definitions in a single repository, our workflow files will be more difficult to manage. By keeping each environment in its own repository, we can maintain a single, simpler workflow per environment repository.

  2. If our module code is in the same repository as the environment code that uses it, we won’t be able to easily version, tag and manage the modules in an independent manner.

We’ll be creating three Terraform modules that will be used to implement a sandbox environment. That means we’ll have a total of four Github repositories: one for each module and one for the environment definition.

We’re choosing to use Github hosting for our module repositories because Terraform has built-in support for importing modules directly from Github. Github hosting is certainly not the only option for managing Terraform modules, but it’s the one that will help us get started as quickly and cheaply as possible. For your next Microservices build, you can decide if you want to host Terraform modules in a different registry.
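
For example, an environment definition can reference a module by its Github path and pin it to a specific tag or branch using a ref query string. The organization name and tag below are placeholders:

module "aws-network" {
  source = "github.com/your-org/module-aws-network?ref=v1.0"

  # ... the module's input variables are passed here
}

Pinning a tag like this is also what makes it worthwhile to version the modules independently from the environments that use them.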

To get started, we’ll create repositories for all the modules we’ll be writing in this chapter. Go ahead and create three new public Github hosted repositories with the names described in Table 5-1

Table 5-1. Infrastructure Module Names

Repository Name         Visibility   Description
module-aws-network      Public       A Terraform module that creates the network
module-aws-kubernetes   Public       A Terraform module that sets up EKS and installs Istio
module-argo-cd          Public       A Terraform module that installs ArgoCD into a cluster

Tip

If you aren’t sure how to create a Github repository, you can follow the Github instructions available at https://help.github.com/en/github/getting-started-with-github/create-a-repo.

We recommend that you make these repositories public so that they are easier to import into your Terraform environment definition. You can use private repositories if you prefer - you’ll just have to add some authentication information to your import command so that Terraform can get to the files correctly. You should also add a .gitignore file to all of these repositories so you don’t end up with a lot of Terraform working files pushed to your Github server. You can do that by choosing a Terraform gitignore in the Github web GUI, or you can use the following shell command to create a new .gitignore file in your code repo:

curl https://raw.githubusercontent.com/github/gitignore/master/Terraform.gitignore > .gitignore

With our three Github module repositories created and ready to be populated, we can dive into the work of writing the actual infrastructure definitions, starting with the network.

The Network Module

The virtual network is a foundational part of our infrastructure, so it makes sense for us to start by defining the network module. In this section, we’ll write an AWS network module that will support a specific Kubernetes and Microservices architecture and workload. Because it’s a module, we’ll be writing input, main and output code - just like we’d write inputs, logic and return values for an application function. When we’re done, we’ll be able to use this module to easily provision a network environment by specifying just a few input values.

We’ll be writing the network infrastructure code in the module-aws-network Github repository that you created earlier. We’ll be creating and editing Terraform files in the root directory of this module. If you haven’t already done so, clone the repository into your local environment and get your favourite text editor ready.

Tip

A completed listing for this AWS network module is available at https://github.com/implementing-microservices/module-aws-network

Network Module Outputs

Let’s start by defining the resources that we expect the networking module to produce. We’ll do this by creating a Terraform file called outputs.tf in the root directory of the module-aws-network repository as follows:

output "vpc_id" {
  value = aws_vpc.main.id
}

output "subnet_ids" {
  value = [
    aws_subnet.public-subnet-a.id,
    aws_subnet.public-subnet-b.id,
    aws_subnet.private-subnet-a.id,
    aws_subnet.private-subnet-b.id]
}

output "public_subnet_ids" {
  value = [aws_subnet.public-subnet-a.id, aws_subnet.public-subnet-b.id]
}

output "private_subnet_ids" {
  value = [aws_subnet.private-subnet-a.id, aws_subnet.private-subnet-b.id]
}

Based on the Terraform module output file, we can see that the network module creates a Virtual Private Cloud (VPC) resource that represents the software defined network for our system. Within that network, our module will also create four logical subnets - these are bounded parts of our network (or subnetworks.) Two of these subnets will be public, meaning that they will be accessible over the Internet. Later, we’ll use all four subnets for our Kubernetes cluster setup and eventually we’ll deploy our Microservices into them.

Network Module Main Configuration

With the output of our module defined, we can start putting together the declarative code that builds it and creates the outputs we are expecting. In a Terraform module, that means we’ll be creating and editing a file named main.tf in the root directory of the module-aws-network repository.

Getting the source code

To help you understand the network implementation, we’ve broken the main.tf source code file into smaller parts. You can find the complete source code listing for this module at https://github.com/implementing-microservices/module-aws-network/blob/master/main.tf

We’ll start our module implementation by creating an AWS Virtual Private Cloud (VPC) resource. Terraform provides us with a special resource for defining AWS VPCs, so we’ll just need to fill in a few parameters to create our definition. When we create a resource in Terraform, we define the parameters and configuration details for it in the Terraform syntax. When we apply these changes, Terraform will make an AWS API call and create the resource if it doesn’t exist already.

Note

You can find all of the Terraform documentation for the AWS provider at Terraform’s documentation site https://www.terraform.io/docs/providers/aws/index.html. You can also consult this documentation if you’re building a similar implementation in GCP or Azure.

Create a file called main.tf in the root of your network module’s repository and add the following Terraform code to your main.tf file to define a new AWS VPC resource:

Example 5-1. module-aws-network/main.tf
provider "aws" {
  region = var.aws_region
}

locals {
  vpc_name = "${var.env_name} ${var.vpc_name}"
  cluster_name = "${var.cluster_name}-${var.env_name}"
}


## AWS VPC definition
resource "aws_vpc" "main" {
  cidr_block = var.main_vpc_cidr
  tags = {
    "Name"                                        = local.vpc_name,
    "kubernetes.io/cluster/${local.cluster_name}" = "shared",
  }
}

The network module starts with a declaration that it is using the AWS provider. This is a special instruction that lets Terraform know that it needs to download and install the libraries it will need in order to communicate with the AWS API and create resources on our behalf. When we validate or apply this file in Terraform, it will attempt to connect to the AWS API using the credentials we’ve configured on the system or in environment variables. We’re also specifying an AWS region here so that Terraform knows which region it should be working in.

We’ve also specified two local variables using a Terraform locals block. These variables define a naming standard that will help us differentiate environment resources in the AWS console. This is especially important if we plan to create multiple environments in the same AWS account space as it will help us avoid naming collisions.

After the local variable declaration, you’ll find the code for creating a new AWS VPC. As you can see, there isn’t much to it, but it does define two important things: a CIDR block and a set of descriptive tags.

A CIDR (Classless Inter-Domain Routing) block is a standard way of describing an IP address range for the network. It’s a shorthand string that defines which IP addresses are allowed inside of a network or a subnet. For example, a CIDR value of 10.0.0.0/16 would mean that you could bind to any IP address between 10.0.0.0 and 10.0.255.255 inside the VPC. We’ll be defining a pretty standard CIDR range for you when we build the sandbox environment, but if you want more details on how CIDRs work and why they exist you can read more about them in the RFC at https://tools.ietf.org/html/rfc4632.
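
If you want to experiment with carving a VPC range into subnet ranges, Terraform’s built-in cidrsubnet function can do the arithmetic for you. The ranges below are only an illustration; in our module we pass explicit CIDR values in as variables instead:

# Illustrative only: splitting a /16 VPC range into four /18 subnet ranges
locals {
  example_vpc_cidr = "10.10.0.0/16"

  example_subnet_cidrs = [
    cidrsubnet(local.example_vpc_cidr, 2, 0), # 10.10.0.0/18
    cidrsubnet(local.example_vpc_cidr, 2, 1), # 10.10.64.0/18
    cidrsubnet(local.example_vpc_cidr, 2, 2), # 10.10.128.0/18
    cidrsubnet(local.example_vpc_cidr, 2, 3), # 10.10.192.0/18
  ]
}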

We’ve also added some tag values to the VPC. Resource tags are useful because they give us a way of easily identifying groups of resources when we need to administrate them. Tags are also useful for automated tasks and for identifying resources that should be managed in specific ways. In our definition, we have defined a “Name” tag to make our VPC easier to identify. We have also defined a Kubernetes tag that identifies this VPC as shared infrastructure for the Kubernetes cluster we’ll define later in this chapter.

Also, notice that in a few cases we’ve referenced a variable instead of an actual value in our configuration. For example, our CIDR block is defined as var.main_vpc_cidr and the VPC’s name is built from var.env_name and var.vpc_name. These are Terraform variables and we’ll define their values later when we use this module as part of our sandbox environment. The variables are what make the modules modular - by changing the variable values we can change the types of environments that we create.

With our main VPC defined, we can move on to configuring the subnets for the network. As we mentioned earlier in this chapter, we’ll be using Amazon’s managed Kubernetes service called EKS to run our workload. In order for EKS to function properly, we’ll need to have subnets defined in two different availability zones for availability and fault tolerance. Amazon also recommends that we configure our VPC with both public and private subnets, so that Kubernetes can create load balancers in the public subnets that manage traffic in the private subnets.

To meet those requirements, we’ll define a total of four subnets. Two of them will be designated as public subnets, which means that they will be accessible over the web. The other two subnets will be private. We’ll also split our public and private subnets up so that they are deployed in separate availability zones. When we are done, we’ll have a network that looks like the network depicted in Figure 5-1.

AWS Subnet Design
Figure 5-1. AWS Subnet Design

We’ve already specified a CIDR for the IP range in our VPC; now we’ll need to split that IP address range up for the subnets to use. Since the subnets are inside of the VPC, they’ll need to have CIDRs that are within the boundaries of the VPC IP range. We won’t actually be defining those IP addresses in our module though. Instead, we’ll use variables just like we did for the VPC.

In addition to the CIDR blocks, we’ll specify the availability zones for our subnets as a parameter. Rather than hardcoding the name of the availability zone, we’ll use a special Terraform construct called a data source that will let us dynamically choose the zone name. In this case, we’ll put public-subnet-a and private-subnet-a in data.aws_availability_zones.available.names[0] and public-subnet-b and private-subnet-b in data.aws_availability_zones.available.names[1]. Using dynamic data like this makes it easier for us to spin up this infrastructure in different regions.

Finally, we’ll add a name tag so that we can easily find our network resources through the admin and ops consoles. We’ll also need to add some EKS tags to the subnet resources so that our AWS Kubernetes service will know which subnets we are using and what they are for. We’ll tag our public subnets with an elb role so that EKS knows it can use those subnets to create and deploy an elastic load balancer. We’ll tag the private subnets with an internal-elb role to indicate that our workloads will be deployed into them and can be load balanced. For more details on how AWS EKS uses load balancer tags, you can consult the documentation at https://docs.aws.amazon.com/eks/latest/userguide/load-balancing.html.

Add the following Terraform code to the end of your main.tf file in order to declare the subnet configuration:

# subnet definition

data "aws_availability_zones" "available" {
  state = "available"
}

resource "aws_subnet" "public-subnet-a" {
  vpc_id            = aws_vpc.main.id
  cidr_block        = var.public_subnet_a_cidr
  availability_zone = data.aws_availability_zones.available.names[0]

  tags = {
    "Name"                                        = "${local.vpc_name}-public-subnet-a"
    "kubernetes.io/cluster/${local.cluster_name}" = "shared"
    "kubernetes.io/role/elb"                      = "1"
  }
}


resource "aws_subnet" "public-subnet-b" {
  vpc_id            = aws_vpc.main.id
  cidr_block        = var.public_subnet_b_cidr
  availability_zone = data.aws_availability_zones.available.names[1]

  tags = {
    "Name"                                        = "${local.vpc_name}-public-subnet-b"
    "kubernetes.io/cluster/${local.cluster_name}" = "shared"
    "kubernetes.io/role/elb"                      = "1"
  }
}


resource "aws_subnet" "private-subnet-a" {
  vpc_id            = aws_vpc.main.id
  cidr_block        = var.private_subnet_a_cidr
  availability_zone = data.aws_availability_zones.available.names[0]


  tags = {
    "Name"                                        = "${local.vpc_name}-private-subnet-a"
    "kubernetes.io/cluster/${local.cluster_name}" = "shared"
    "kubernetes.io/role/internal-elb"             = "1"
  }
}

resource "aws_subnet" "private-subnet-b" {
  vpc_id            = aws_vpc.main.id
  cidr_block        = var.private_subnet_b_cidr
  availability_zone = data.aws_availability_zones.available.names[1]

  tags = {
    "Name"                                        = "${local.vpc_name}-private-subnet-b"
    "kubernetes.io/cluster/${local.cluster_name}" = "shared"
    "kubernetes.io/role/internal-elb"             = "1"
  }
}
Tip

In Terraform, a data element is a way of querying the provider for information. In the network module, we’re using the “aws_availability_zones” data element to ask AWS for availability zone IDs in the region we’ve specified. This is a nice way to avoid hardcoding values into the module.

Although we’ve configured four subnets and their IP ranges, we haven’t yet defined the network rules that AWS will need to manage traffic through them. To finish our network design, we’ll need to implement a set of routing tables that define what traffic sources we will allow into our subnets. For example, we’ll need to establish how traffic will be routed through our public subnets and how each of the subnets will be allowed to communicate with each other.

We’ll start by defining the routing rules for our two public subnets: public-subnet-a and public-subnet-b. To make these subnets accessible on the internet, we’ll need to add a special resource to our VPC called an Internet Gateway. This is an AWS network component that connects our private cloud to the public internet. Terraform gives us a resource definition for the gateway, so we’ll use that and tie it to our VPC with the vpc_id configuration parameter.

Once we’ve added the Internet Gateway, we’ll need to define routing rules that let AWS know how to route traffic from the gateway into our subnets. To do that, we’ll create an aws_route_table resource that allows all traffic from the Internet (which we’ll identify with the CIDR block 0.0.0.0/0) through the gateway. Then we just need to create associations between our two public subnets and the table we’ve just defined.

Add the following Terraform code to main.tf to define routing instructions for our network:

# Internet gateway and routing tables for public subnets
resource "aws_internet_gateway" "igw" {
  vpc_id = aws_vpc.main.id

  tags = {
    Name = "${local.vpc_name}-igw"
  }
}

resource "aws_route_table" "public-route" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.igw.id
  }

  tags = {
    "Name" = "${local.vpc_name}-public-route"
  }
}

resource "aws_route_table_association" "public-a-association" {
  subnet_id      = aws_subnet.public-subnet-a.id
  route_table_id = aws_route_table.public-route.id
}


resource "aws_route_table_association" "public-b-association" {
  subnet_id      = aws_subnet.public-subnet-b.id
  route_table_id = aws_route_table.public-route.id
}

With the routes for our public subnets defined, we can dive into the setup for our two private subnets. The route configuration for the private subnets will be a bit more complicated than what we’ve done so far. That’s because we’ll need to define a route from our private subnet out to the Internet to allow our Kubernetes pods to talk to the EKS service.

For that kind of route to work, we’ll need a way for nodes in our private subnets to reach the Internet Gateway we’ve deployed in the public subnets. In AWS, that means we need to create a NAT (Network Address Translation) Gateway resource that gives us a path out. When we create the NAT gateway, we’ll also need to assign it a special kind of IP address called an elastic IP address (or EIP). This is an AWS construct that means the IP is a real Internet accessible network address, unlike all of the other addresses in our network which are virtual and exist inside of AWS alone. Since real IP addresses are a scarce resource, AWS limits the number of them you can allocate. Unfortunately, we can’t create a NAT gateway without one, so we’ll have to use two of them - one for each NAT gateway we are creating.

Add the following Terraform code to implement the NAT gateways in our network:

resource "aws_eip" "nat-a" {
  vpc = true
  tags = {
    "Name" = "${local.vpc_name}-NAT-a"
  }
}

resource "aws_eip" "nat-b" {
  vpc = true
  tags = {
    "Name" = "${local.vpc_name}-NAT-b"
  }
}

resource "aws_nat_gateway" "nat-gw-a" {
  allocation_id = aws_eip.nat-a.id
  subnet_id     = aws_subnet.public-subnet-a.id
  depends_on    = [aws_internet_gateway.igw]


  tags = {
    "Name" = "${local.vpc_name}-NAT-gw-a"
  }
}

resource "aws_nat_gateway" "nat-gw-b" {
  allocation_id = aws_eip.nat-b.id
  subnet_id     = aws_subnet.public-subnet-b.id
  depends_on    = [aws_internet_gateway.igw]


  tags = {
    "Name" = "${local.vpc_name}-NAT-gw-b"
  }
}

In addition to the NAT gateways we’ve created, we’ll need to define routes for our private subnets. Add the following Terraform code to main.tf to complete the definition of our network routes:

module-aws-network/main.tf - private route tables

resource "aws_route_table" "private-route-a" {
  vpc_id = aws_vpc.main.id
  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.nat-gw-a.id
  }

  tags = {
    "Name" = "${local.vpc_name}-private-route-a"
  }
}


resource "aws_route_table" "private-route-b" {
  vpc_id = aws_vpc.main.id
  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.nat-gw-b.id
  }

  tags = {
    "Name" = "${local.vpc_name}-private-route-b"
  }
}


resource "aws_route_table_association" "private-a-association" {
  subnet_id      = aws_subnet.private-subnet-a.id
  route_table_id = aws_route_table.private-route-a.id
}


resource "aws_route_table_association" "private-b-association" {
  subnet_id      = aws_subnet.private-subnet-b.id
  route_table_id = aws_route_table.private-route-b.id
}

That’s it for our main network definition. When we eventually run this Terraform file, we’ll end up with an AWS software defined network that is ready for Kubernetes and our Microservices. But, before we can use it, we’ll need to define all of the input variables that this module needs. Although we’ve referenced a lot of var values in our code, Terraform modules require us to declare all of the input variables we’ll be using in a specific file called variables.tf. If we don’t do that, we won’t be able to pass variable values into our module.

Network Module Variables

Create a file in the root folder of the network module called variables.tf. Add the following Terraform code to variables.tf to define the inputs for the module:

variable "env_name" {
  type = string
}

variable "aws_region" {
  type = string
}

variable "vpc_name" {
  type    = string
  default = "ms-up-running"
}

variable "main_vpc_cidr" {
  type = string
}

variable "public_subnet_a_cidr" {
  type = string
}

variable "public_subnet_b_cidr" {
  type = string
}

variable "private_subnet_a_cidr" {
  type = string
}

variable "private_subnet_b_cidr" {
  type = string
}

variable "cluster_name" {
  type    = string
}

As you can see, the variable definitions are fairly self-explanatory. Each one declares a name and a type, and can optionally include a description and a default value. In our module we’re only using string values. In some cases, we’ve also provided a default value so that those inputs don’t have to be defined for every environment. We’ll give the module values for the remaining variables when we use it to create an environment.

Note

It’s good practice to include a description attribute for every variable in your Terraform module. This improves maintainability and usability of your modules and becomes increasingly important over time. We’ve omitted the descriptions in all our module variables to save space in the book.
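
For reference, a documented version of one of our variables might look like the following; the description text here is just an illustration:

variable "vpc_name" {
  type        = string
  description = "Name used to tag the VPC and its sub-resources"
  default     = "ms-up-running"
}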

The Terraform code for our network module is now complete. At this point, you should have a list of files that looks something like this in your module directory:

drwxr-xr-x   3 msur  staff   96 14 Jun 09:57 ..
drwxr-xr-x   7 msur  staff  224 14 Jun 09:58 .
-rw-r--r--   1 msur  staff   23 14 Jun 09:57 README.md
drwxr-xr-x  13 msur  staff  416 14 Jun 09:57 .git
-rw-r--r--   1 msur  staff    0 14 Jun 09:58 main.tf
-rw-r--r--   1 msur  staff  612 14 Jun 09:58 variables.tf
-rw-r--r--   1 msur  staff   72 14 Jun 09:58 outputs.tf

With the code written, we’ll test the network module by creating a sandbox environment network. But before we use the module, we should make sure we haven’t made any syntax errors. The Terraform command line application includes some handy features to format and validate code. If you haven’t already installed the Terraform client on your local system, you can find a binary for your operating system at https://www.terraform.io/downloads.html.

Use the following Terraform command while you are in your module’s working directory to format the module’s code:

module-aws-network$ terraform fmt

The fmt command will “lint” or format all Terraform code in the working directory and ensure that it conforms to a set of built-in style guidelines. It will automatically make those changes for you and will list any files that have been updated.

Next, run terraform init so that Terraform can install the AWS provider libraries; we need to do this before we can validate the code. Note that you’ll need to have AWS credentials configured for this to work. If you haven’t done that yet, follow the instructions in the previous chapter.

module-aws-network$ terraform init

Finally, you can run the validate command to make sure that the module is syntactically correct. If you run into any problems, try to fix those before you continue.

module-aws-network$ terraform validate
Success! The configuration is valid.
Tip

If you need to debug your Terraform code, you can set the environment variable TF_LOG to INFO or DEBUG. That will instruct Terraform to emit logging info to standard output.
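
For example, to see detailed logging for a single command, you can set the variable inline:

$ TF_LOG=DEBUG terraform plan

Remember to unset TF_LOG (or set it to an empty value) when you are done, because the debug output is very verbose.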

When you are satisfied that the code is formatted and valid, you can commit your changes to the Github repository. If you’ve been working in a local repository, you can use the following commands to push your changes to the main repository:

module-aws-network$ git add .
module-aws-network$ git commit -m "network module created"
[master ddb7e41] network module created
 3 files changed, 226 insertions(+)
module-aws-network$ git push

Our Terraform based network module is now complete and available for use. It has a variables.tf file that describes the required and optional input variables to use it. It has a main.tf file that declaratively defines the resources for our network design. Finally, it has an outputs.tf that defines the significant resources that we’ve created in the module. Now, we can use the module to create a real network in our sandbox environment.

Create a Sandbox Network

The nice thing about using Terraform modules is that we can create our environments easily in a repeatable way. Apart from the specific values we pass in for the variables defined in the variables.tf file, any environment that we create with the module will operate with a network infrastructure that we know and understand. That means we can expect our microservices to work in a predictable way as we move them through testing and release environments, since we’ve reduced the level of variation.

But, to apply the module we’ve defined and create a new environment, we’ll need to call it from a Terraform file that defines values for the module’s variables. To do that, we’ll create a “sandbox” environment that demonstrates a practical example of using a Terraform module. If you followed the steps in the previous chapter (TK link), you’ll already have a code repository for your sandbox environment with a single main.tf file in it.

In order to use the network module that we’ve created, we’ll use a special Terraform block called module. It allows us to reference a Terraform module that we’ve created and pass in values for the variables that we’ve defined. In order to use our AWS network module to create a sandbox environment, you’ll need to determine values for the variables shown in Table 5-2:

Table 5-2. Sandbox Environment Network Variables

Name         Description                                           Example
source       The URL of your Github hosted network module          github.com/implementing-microservices/module-aws-network
aws_region   The AWS region where your network will be deployed    eu-west-1

Once you’ve decided on the values for your variable inputs, update your sandbox’s main.tf file so that it looks like the following code, with your values and S3 bucket name populated. In addition to the source and region, the module call also needs the environment name, the cluster name and CIDR ranges for the VPC and its subnets; the CIDR values shown below are example ranges that you can keep or adjust:

terraform {
  backend "s3" {
    bucket = "bucket name"
    key    = "terraform/backend"
    region = "region code"
  }
}

locals {
  env_name         = "sandbox"
  aws_region       = "region code"
  k8s_cluster_name = "ms-cluster"
}

module "aws-network" {
  source = "github.com/implementing-microservices/module-aws-network"
  aws_region = "region code"
}
Note

Amazon’s S3 bucket names must be globally unique, so you’ll need to change the value of bucket to something that is unique and meaningful for you. Refer to the previous chapter for instructions on how to set up the backend. If you just want to do a quick and dirty test, you can omit the backend definition and Terraform will store state locally in your file system.

Our infrastructure pipeline will apply Terraform changes, but before we kick it off we need to check to make sure that the Terraform code we’ve written will work. A good first step is to format and validate the code locally:

$ terraform fmt
[...]
$ terraform init
[...]
$ terraform validate
Success! The configuration is valid.
Tip

If you need to debug the networking module and end up making code changes, you may need to run the following command in your sandbox environment directory:

$ terraform get -update

This will instruct Terraform to pull the latest version of the network module from Github.

If the code is valid, we can generate a plan to preview the changes that Terraform will make when they are applied. It’s always a good idea to do a dry run and examine the changes that will be made before you actually change the environment, so make this step a part of your workflow. To get the Terraform plan, run this command:

$ terraform plan

Terraform will provide you with a list of the resources that will be created, deleted and updated. Look through the plan and make sure the output matches what you expect. When you are happy that the plan makes sense, you can commit the code, tag it for release and push it to the Github repository:

$ git add .
$ git commit -m "initial network release"
$ git tag -a v1.0 -m "network build"
$ git push origin v1.0

With the code tagged and pushed, our Github Actions pipeline should take over and start building the network for our Sandbox environment. You’ll need to log in to Github and check the Actions tab in your sandbox environment repository to make sure that everything goes according to plan. If you don’t remember how to do that, you’ll find instructions in the previous chapter. If everything goes well, you should now have an AWS network running and ready to use. That means it’s time to start writing the module for the Kubernetes service.

The Kubernetes and Istio Module

One of the most important parts of our Microservices infrastructure is the Kubernetes layer that orchestrates our container based services. If we set it up correctly, Kubernetes will give us an automated solution for resiliency, scaling and fault tolerance. It will also give us a great foundation for deploying our services in a dependable way. On top of that, an Istio service mesh gives us a powerful way of managing traffic and improving the way our microservices communicate.

To build our Kubernetes module, we’ll follow the same steps that we did to build our network module. We’ll start by defining a set of output variables that describe what the module will produce, then we’ll write the code that declaratively defines the configuration that Terraform will create, and finally we’ll define the inputs. As we mentioned earlier in this chapter, we are managing each of our infrastructure modules in its own Github code repository. So, make sure you start by creating a new Github repository for the Kubernetes module if you haven’t done that already.

Implementing Kubernetes can get very complicated. So, to get our system up and running as quickly as possible, we’ll use a managed service that will hide some of the setup and management complexity for us. Since we are running on AWS in our examples, we’ll use the Elastic Kubernetes Service (EKS) that is bundled with Amazon’s cloud offering. Unfortunately, the configuration for managed Kubernetes services tends to be very vendor specific, so the examples we provide here will take some re-work if you want to use them with another provider. But, you should be able to find examples on the web that you can adapt to your solution.

An EKS cluster contains two parts: a control plane that hosts the Kubernetes system software and a node group that hosts the virtual machines that our microservices will run on. In order to configure EKS, we’ll need to provide parameters for both of these areas. When the module has finished running, it will return an EKS cluster identifier so that we have the option of inspecting or adding to the cluster with other modules.

With all that in mind, let’s dive into the code that will make it come to life. We’ll be working in the module-aws-kubernetes Github repository that you created earlier, so make sure you start by cloning it into your local machine. When you’ve done that, we can begin by editing the Terraform outputs file.

Tip

A completed listing for this AWS Kubernetes module is available at https://github.com/implementing-microservices/module-aws-kubernetes

Kubernetes Module Outputs

We’ll start by declaring the outputs that our module provides. Create a Terraform file called outputs.tf in the root directory of the module-aws-kubernetes repository and add the following code to it:

output "eks_cluster_id" {
  value = aws_eks_cluster.ms-up-running.id
}


output "eks_cluster_name" {
  value = aws_eks_cluster.ms-up-running.name
}


output "eks_cluster_certificate_data" {
  value = aws_eks_cluster.ms-up-running.certificate_authority.0.data
}


output "eks_cluster_endpoint" {
  value = aws_eks_cluster.ms-up-running.endpoint
}


output "eks_cluster_nodegroup_id" {
  value = aws_eks_node_group.ms-node-group.id
}

The main value we’re returning is the identifier for the EKS cluster that we’ll be creating in this module. The rest of the values need to be returned so that we can access the cluster from other modules once the cluster is stood up and operational. For example, we’ll need the endpoint and certificate data when we install the ArgoCD server into this EKS cluster at the end of the chapter.

While the output of our module is pretty simple, the work of getting our EKS based Kubernetes system setup is going to be a bit more complicated. Just like we did before, we’ll build the module’s main Terraform file in parts before we test it and apply it.

Defining the EKS Cluster

To begin with, create a Terraform file called main.tf in the root directory of your Kubernetes module and add an AWS provider definition:

provider "aws" {
  region = var.aws_region
}

Remember that we’ll be using the Terraform naming convention of var to indicate values that can be replaced by variables when our module is invoked.

As we mentioned earlier, we’re going to use Amazon’s EKS to create and manage our Kubernetes installation. But, EKS will need to create and modify AWS resources on our behalf in order to run. So, we’ll need to set up permissions in our AWS account so that it can do the work it needs to do. We’ll need to define policies and security rules at the overall cluster level and also for the virtual machines, or nodes, that EKS will be spinning up for us to run Microservices on.

We’ll start by focusing on the rules and policies for the entire EKS cluster. Add the following Terraform code to your main.tf file to define a new cluster access management policy:

locals {
  cluster_name = "${var.cluster_name}-${var.env_name}"
}


resource "aws_iam_role" "ms-cluster" {
  name = local.cluster_name

  assume_role_policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "eks.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
POLICY
}

resource "aws_iam_role_policy_attachment" "ms-cluster-AmazonEKSClusterPolicy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
  role       = aws_iam_role.ms-cluster.name
}

The snippet above establishes a trust policy that allows the AWS EKS service to act on your behalf. It defines a new identity and access management role for our EKS service and attaches a policy called AmazonEKSClusterPolicy to it. This policy has been defined by AWS for us and gives EKS the permissions it needs to create virtual machines and make network changes as part of its Kubernetes management work. Notice that we are also defining and using a local variable for the name of the cluster. We’ll use that variable throughout the module.

Now that the cluster service’s role and policy are defined, add the following code to your module’s main.tf file to define a network security policy for the cluster:

resource "aws_security_group" "ms-cluster" {
  name        = "ms-up-running-cluster"
  vpc_id      = var.vpc_id

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "ms-up-running"
  }
}

A VPC security group restricts the kind of traffic that can go into and out of the network. The Terraform code we’ve just written defines an “egress” rule that allows unrestricted outbound traffic, but doesn’t allow any inbound traffic because there is no “ingress” rule defined. Notice that we are applying this security group to a VPC that will be defined by an input variable. When we use this module, we can give it the ID of the VPC that our networking module has created.
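
For illustration only, if you ever needed to allow inbound traffic to this security group, you would add an ingress block alongside the egress rule - something like the following, which is not part of our module and uses a hypothetical CIDR range:

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["10.10.0.0/16"]   # hypothetical source range
  }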

With these policies and a security group defined for the EKS cluster, we can now add the declaration for the cluster itself to the main.tf Terraform file:

resource "aws_eks_cluster" "ms-up-running" {
  name     = local.cluster_name
  role_arn = aws_iam_role.ms-cluster.arn

  vpc_config {
    security_group_ids = [aws_security_group.ms-cluster.id]
    subnet_ids         = var.cluster_subnet_ids
  }

  depends_on = [
    aws_iam_role_policy_attachment.ms-cluster-AmazonEKSClusterPolicy
  ]
}

The EKS cluster definition we’ve just created is pretty simple. It references the name, role and security group values we defined earlier and declares a dependency on the cluster policy attachment. It also references a set of subnets that the cluster will be managing. These subnets will be the ones that we created earlier in the networking module and we’ll be able to pass them into this Kubernetes module as a variable.

When AWS creates an EKS cluster, it automatically sets up all of the management components that we need to run our Kubernetes cluster. This is called the “control plane” and it’s the brain of our Kubernetes system. But, in addition to the control plane, our Microservices need a place where they can run. In Kubernetes, that means we need to setup nodes - the physical or virtual machines that containerized workloads can run on.

One of the advantages of using a managed Kubernetes service like EKS is that we can off-load some of the work of managing the creation, removal and updates of Kubernetes nodes. For our configuration, we’ll define a managed EKS node group and let AWS provision resources and interact with the Kubernetes system for us. But, to get a managed node group running, we’ll still need to define a few important configuration values.

Defining the EKS Node Group

Just like we did for our cluster, we’ll begin the node configuration by defining a role and some security policies. Add the following node group IAM definitions to our Kubernetes module’s main.tf file:

module-aws-kubernetes/main.tf - node role and policies

# Node Role
resource "aws_iam_role" "ms-node" {
  name = "${local.cluster_name}.node"

  assume_role_policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
POLICY
}

# Node Policy
resource "aws_iam_role_policy_attachment" "ms-node-AmazonEKSWorkerNodePolicy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
  role       = aws_iam_role.ms-node.name
}

resource "aws_iam_role_policy_attachment" "ms-node-AmazonEKS_CNI_Policy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
  role       = aws_iam_role.ms-node.name
}

resource "aws_iam_role_policy_attachment" "ms-node-AmazonEC2ContainerRegistryReadOnly" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
  role       = aws_iam_role.ms-node.name
}

The role and policies in the Terraform snippet above will allow any nodes that are created to communicate with Amazon’s container registries and virtual machine services. We need these policies, because the nodes in our Kubernetes system will need to be able to provision computing resources and access containers in order to run services. You can find more details on the IAM role for EKS worker nodes in the AWS EKS documentation at https://docs.aws.amazon.com/eks/latest/userguide/worker_node_IAM_role.html.

Now that we have our node’s role and policy resources defined, we can write the declaration for a node group that uses them. In EKS, a managed node group needs to specify the types of compute and storage resources it will use along with some defined limits for the number of individual nodes or VMs that can be created automatically. This is important because we are letting EKS automatically provision and scale our nodes. We don’t want to inadvertently consume massive amounts of AWS resources and end up with a correspondingly massive bill.

We could hardcode all of these parameters in our module, but instead we’ll use input variables as values for the size limits, disk size and CPU types. That way we’ll be able to use the same Kubernetes module to create different kinds of environments. For example, a development environment can be setup to use minimal resources, while a production environment can be more robust.

Add the following Terraform code to the end of the module’s main.tf file to define our EKS node group:

modules/eks/main.tf

resource "aws_eks_node_group" "ms-node-group" {
  cluster_name    = aws_eks_cluster.ms-up-running.name
  node_group_name = "microservices"
  node_role_arn   = aws_iam_role.ms-node.arn
  subnet_ids      = var.nodegroup_subnet_ids

  scaling_config {
    desired_size = var.nodegroup_desired_size
    max_size     = var.nodegroup_max_size
    min_size     = var.nodegroup_min_size
  }

  disk_size      = var.nodegroup_disk_size
  instance_types = var.nodegroup_instance_types

  depends_on = [
    aws_iam_role_policy_attachment.ms-node-AmazonEKSWorkerNodePolicy,
    aws_iam_role_policy_attachment.ms-node-AmazonEKS_CNI_Policy,
    aws_iam_role_policy_attachment.ms-node-AmazonEC2ContainerRegistryReadOnly,
  ]
}

The node group declaration is the last part of our EKS configuration. We have enough here to be able to call this module from our sandbox environment and instantiate a running Kubernetes cluster on the AWS EKS service. Our module’s outputs will return the values that are needed to connect to the node group once it’s running; we’ll sketch those outputs in a moment. But, it’s also useful to provide those connection details in a configuration file for the kubectl CLI that most operators use for Kubernetes management.
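First, here is a sketch of what the module’s outputs file could expose. The output names match the values our sandbox environment will reference later in this chapter, but treat the exact listing as an assumption and adapt it to whatever your environment actually needs:

modules/eks/outputs.tf

output "eks_cluster_id" {
  value = aws_eks_cluster.ms-up-running.id
}

output "eks_cluster_name" {
  value = aws_eks_cluster.ms-up-running.name
}

output "eks_cluster_endpoint" {
  value = aws_eks_cluster.ms-up-running.endpoint
}

output "eks_cluster_certificate_data" {
  value = aws_eks_cluster.ms-up-running.certificate_authority.0.data
}

output "eks_cluster_nodegroup_id" {
  value = aws_eks_node_group.ms-node-group.id
}

With the outputs in mind, let’s turn to the configuration file for kubectl.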

To create a Kubernetes configuration file, append the following code to your module’s main.tf file:

# Create a kubeconfig file based on the cluster that has been created
resource "local_file" "kubeconfig" {
  content  = <<KUBECONFIG_END
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: ${aws_eks_cluster.ms-up-running.certificate_authority.0.data}
    server: ${aws_eks_cluster.ms-up-running.endpoint}
  name: ${aws_eks_cluster.ms-up-running.arn}
contexts:
- context:
    cluster: ${aws_eks_cluster.ms-up-running.arn}
    user: ${aws_eks_cluster.ms-up-running.arn}
  name: ${aws_eks_cluster.ms-up-running.arn}
current-context: ${aws_eks_cluster.ms-up-running.arn}
kind: Config
preferences: {}
users:
- name: ${aws_eks_cluster.ms-up-running.arn}
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1alpha1
      command: aws-iam-authenticator
      args:
        - "token"
        - "-i"
        - "${aws_eks_cluster.ms-up-running.name}"
    KUBECONFIG_END
  filename = "kubeconfig"
}

This code looks complicated, but it’s actually fairly simple. We are using a special Terraform resource called local_file to create a file named “kubeconfig”. We are then populating kubeconfig with YAML content that defines connection parameters for our Kubernetes cluster. Notice that we are getting the values for the YAML file from the EKS resources that we created in the module.

When Terraform runs this block of code it will create a “kubeconfig” file in a local directory. We’ll be able to use that file to connect to the Kubernetes environment from CLI tools. We made a special provision for this file when we built our pipeline in the previous chapter. When you run the infrastructure pipeline you’ll be able to download this populated configuration file and use it to connect to the cluster.
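For example, once the cluster is up and you’ve downloaded the kubeconfig file, a quick sanity check from your terminal might look like this (assuming kubectl and the aws-iam-authenticator are installed locally):

$ kubectl --kubeconfig ./kubeconfig get nodes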

The configuration file will make it a lot easier to connect to the cluster from your machine. But we’re also going to use it for the Istio installation. That’s the next bit of code we need to add to the Terraform file.

Installing Istio

If you recall from the beginning of this chapter, the Istio service mesh is a system of micro-proxies and management components that will help us secure, stabilize, shape and see all of the communication happening between our services. We need to bootstrap the Istio installation now, so that we can use the service mesh later in the book, in Chapter TK, when we deploy our Microservices.

Under the hood, Istio’s architecture is pretty complicated, with a lot of moving parts. Fortunately, Istio provides an easy-to-use CLI application called istioctl that can do the install with a single command line instruction. Unfortunately, as of this writing there isn’t a good, native Terraform way of installing the custom extensions that Istio needs for operation. Instead, we’ll have to do something that is a bit of a “hack” - we’ll need to run Istio’s command line application from our Terraform module.

The Istio CLI application is pretty straightforward, but it needs to be able to talk to the EKS cluster that we are creating in order to run. It needs to know the network address of our newly created cluster and it needs to have the right credentials to get access to the cluster’s management API. The good news is that it understands the kubeconfig format of the file we’ve just created, so we just need to call it and point it at the config file.

Add the following code to your main.tf file, to install Istio into the cluster:

resource "null_resource" "istio-install" {
  # Reinstall istio if the cluster is changed
  triggers = {
    cluster_id = aws_eks_cluster.ms-up-running.id
  }

  # Make sure that the EKS node group is running before we try to install Istio
  depends_on = [aws_eks_node_group.ms-node-group]

  provisioner "local-exec" {
    command = "istioctl install -y --kubeconfig kubeconfig"
  }
}

Since we are using a command line application to do the Istio installation, there isn’t a specific Terraform resource defined for us to use. Instead, we’re using a special Terraform resource called a “null_resource”. This means that the resource is generic and Terraform will run the expressions inside it without tying it to any specific provider’s API. Because this is a generic resource, we’ve defined a “triggers” property that ties “istio-install” to the ID of the EKS cluster. This creates a dependency on the cluster resource, and it also means that if the EKS cluster is ever destroyed and recreated, this code will be run again.

The “local-exec” provisioner resource is a special object that instructs Terraform to run a command in the local operating system. Unlike the rest of the Terraform resources we’ve been using, a provisioner is not a declarative statement. That means, instead of identifying a target state and letting Terraform do the work of making the appropriate changes, provisioner statements just get blindly invoked. In this case, we are using it to call istioctl with arguments for installation and the location of our kubeconfig file.

Warning

Provisioners are a last resort and they break Terraform’s declarative model. Use them sparingly and only when you absolutely must!

Of course, for a local call to istioctl to work, we’ll need it installed locally before the Terraform module is run. This isn’t ideal, as it introduces a special precondition into our build, but it’s worth the tradeoff of getting Istio installed quickly and easily. In the previous chapter, we introduced a special installation step for installing istioctl as part of our infrastructure pipeline. Now you know why!
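If you want to run the module locally rather than through the pipeline, you’ll need to install istioctl yourself first. The Istio project’s download script is the usual route; something like the following should work (the version shown is only an example - check the Istio releases page for the one you want):

$ curl -L https://istio.io/downloadIstio | ISTIO_VERSION=1.9.1 sh -
$ export PATH="$PATH:$PWD/istio-1.9.1/bin"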

There’s one last thing we’ll need to do to finish our Istio installation: create a namespace with a special Istio label.

Kubernetes is object based and most of the work you’ll do with it involves creating declarative objects in the system that let Kubernetes know what the system is supposed to look like. To keep those objects organized, Kubernetes provides a namespace management facility so that you can avoid naming conflicts and easily find and update them.

We’re going to create a namespace for the objects that we’ll be using to describe our Microservices deployments. We’ll use this namespace later when we deploy our services into the stack. But, we’ll also be adding a special bit of metadata that lets Kubernetes know the service should be part of Istio’s service mesh.

Add the following code to your Kubernetes module’s main.tf file to create the namespace with the Istio injection label:

provider "kubernetes" {
  load_config_file       = false
  cluster_ca_certificate = base64decode(aws_eks_cluster.ms-up-running.certificate_authority.0.data)
  host                   = aws_eks_cluster.ms-up-running.endpoint
  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    command     = "aws-iam-authenticator"
    args        = ["token", "-i", "${aws_eks_cluster.ms-up-running.name}"]
  }
}

# Create a namespace for microservice pods and label it for automatic sidecar injection
resource "kubernetes_namespace" "ms-namespace" {

  # Make sure that the EKS node group is running before we try to install Istio
  depends_on = [aws_eks_node_group.ms-node-group]
  metadata {
    labels = {
      istio-injection = "enabled"
    }
    name = var.ms_namespace
  }
}

The code above declares a new namespace by using Terraform’s Kubernetes provider. Remember that a Terraform provider is like a plug-in that tells Terraform how to talk to a platform or system. In this case, Terraform provides a provider to perform declarative operations on Kubernetes clusters. We’re not going to use Terraform to manage our Kubernetes cluster once it’s up, but we are going to use it to bootstrap our namespace.

To configure the Kubernetes provider, we’re using the properties of the EKS cluster that we provisioned earlier in the module. This is quite similar to what we did earlier when we created the kubeconfig file. With the provider configured, we can use the provider’s “kubernetes_namespace” resource to declare our new namespace and our special istio-injection label. We’ve also used Terraform’s depends_on attribute to ensure that we don’t try to create the namespace before the EKS node group is running and ready to be used.
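Once the environment is provisioned, you can confirm that the label took effect with a quick kubectl check (the namespace name here assumes the module’s default of “microservices”):

$ kubectl --kubeconfig kubeconfig get namespace microservices --show-labels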

With the code we’ve written in main.tf, Terraform will instantiate an EKS cluster using AWS API calls and then it will call our local provisioners to install a default Istio profile into the cluster with a Kubernetes namespace. We’re almost done writing our module. All that’s left is to define the variables.

Kubernetes Module Variables

To declare the variables for our Kubernetes module, create a file called variables.tf in your module-aws-kubernetes repository and add the following code:

variable "aws_region" {
  type        = string
  default     = "eu-west-2"
}

variable "env_name" {
  type = string
}

variable "cluster_name" {
  type = string
}

variable "ms_namespace" {
  type    = string
  default = "microservices"
}

variable "vpc_id" {
  type = string
}

variable "cluster_subnet_ids" {
  type = list(string)
}

variable "nodegroup_subnet_ids" {
  type = list(string)
}

variable "nodegroup_desired_size" {
  type    = number
  default = 1
}

variable "nodegroup_min_size" {
  type    = number
  default = 1
}

variable "nodegroup_max_size" {
  type    = number
  default = 5
}

variable "nodegroup_disk_size" {
  type = string
}

variable "nodegroup_instance_types" {
  type = list(string)
}

Our AWS Kubernetes module is now fully written. As we did for our network module, we’ll take a moment to clean up the formatting and validate the syntax of the code by running the following Terraform commands:

module-aws-kubernetes$ terraform fmt
[...]
module-aws-kubernetes$ terraform init
[...]
module-aws-kubernetes$ terraform validate
Success! The configuration is valid.

When you are satisfied that the code is valid, commit your changes and push them to Github so that we can use this module in the Sandbox environment.

Create a Sandbox Kubernetes Cluster

Now that our complex Kubernetes system and Istio service mesh are wrapped up in a simple module, the work of setting it up in our Sandbox environment is pretty simple. All we’ll need to do is call our module with the input parameters that we want. Remember that our Sandbox environment is defined in its own code repository and has its own Terraform file called main.tf that we’ve used to set up the network. We’ll be editing that file again, but this time we’ll add a call to the Kubernetes module.

If you recall, we gave some of our input variables default values. To keep things simple, we’ll just use those default values in our sandbox environment. We’ll also need to pass some of the output variables from our network module into this Kubernetes module so that it installs the cluster on the network we’ve just created. But, beyond those inputs, you’ll need to define the aws_region value for your installation. This should be the same as the value you used for the network module and the backend configuration. You’ll also need to set the source parameter to point to your Github hosted module.

Update the main.tf file of your sandbox environment so that it looks like the following, with the appropriate values replaced:

terraform {
  backend "s3" {
    bucket = "bucket name"
    key    = "terraform/backend"
    region = "region code"
  }
}

locals {
  env_name         = "sandbox"
  aws_region       = "eu-west-2"
  k8s_cluster_name = "ms-cluster"
}


module "aws-network" {
  source = "github.com/implementing-microservices/module-aws-network"


  env_name              = local.env_name
  vpc_name              = "msur-VPC"
  cluster_name          = local.k8s_cluster_name
  aws_region            = local.aws_region
  main_vpc_cidr         = "10.10.0.0/16"
  public_subnet_a_cidr  = "10.10.0.0/18"
  public_subnet_b_cidr  = "10.10.64.0/18"
  private_subnet_a_cidr = "10.10.128.0/18"
  private_subnet_b_cidr = "10.10.192.0/18"
}


module "aws-kubernetes-cluster" {
  source = "github.com/implementing-microservices/module-aws-kubernetes"

  ms_namespace       = "microservices"
  env_name           = local.env_name
  aws_region         = local.aws_region
  cluster_name       = local.k8s_cluster_name
  vpc_id             = module.aws-network.vpc_id
  cluster_subnet_ids = module.aws-network.subnet_ids


  nodegroup_subnet_ids     = module.aws-network.private_subnet_ids
  nodegroup_disk_size      = "20"
  nodegroup_instance_types = ["t3.medium"]
  nodegroup_desired_size   = 1
  nodegroup_min_size       = 1
  nodegroup_max_size       = 3
}

Now, you should be able to commit and push this file into your CI/CD infrastructure pipeline and create a working EKS cluster as well as an Istio service mesh installation. But, be prepared to wait a while for a result: provisioning an EKS cluster can take 10 to 15 minutes. When it’s done you’ll have a powerful infrastructure ready to run your Microservices resiliently.
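If you’d like to keep an eye on progress while you wait, the AWS CLI can report the cluster’s status. The exact cluster name depends on how the module constructs local.cluster_name, so list the clusters first if you aren’t sure (this assumes the AWS CLI is configured for the same account and region):

$ aws eks list-clusters --region eu-west-2
$ aws eks describe-cluster --name <your-cluster-name> --region eu-west-2 --query "cluster.status"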

Warning

The AWS EKS cluster we’ve defined here will accrue charges even when it’s idle. We recommend that you destroy the environment when you are not using it. You’ll find instructions for doing that at the end of this chapter.

Our final step will be to install a deployment tool that will come in handy when it’s time to release our services into our environment’s Kubernetes cluster.

Setting up Argo CD

As we mentioned in the beginning of this chapter, we’re going to end our infrastructure setup by installing a continuous delivery server that we’ll be using later on. We’ll continue to follow the module pattern by creating a Terraform module for ArgoCD that we can call to bootstrap the server in our sandbox environment. We’ll be installing this resource into the Kubernetes cluster rather than onto the AWS platform. That means we won’t be using the AWS provider. Instead, we’ll use Terraform’s Kubernetes and Helm providers.

Note

A completed version of this module is available at https://github.com/implementing-microservices/module-argo-cd

Create a file called main.tf in the root directory of the module-argo-cd git repository that you created earlier. Add the following code to set up the providers we need for the installation:

provider "kubernetes" {
  load_config_file       = false
  cluster_ca_certificate = base64decode(var.kubernetes_cluster_cert_data)
  host                   = var.kubernetes_cluster_endpoint
  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    command     = "aws-iam-authenticator"
    args        = ["token", "-i", "${var.kubernetes_cluster_name}"]
  }
}

provider "helm" {
  kubernetes {
    load_config_file       = false
    cluster_ca_certificate = base64decode(var.kubernetes_cluster_cert_data)
    host                   = var.kubernetes_cluster_endpoint
    exec {
      api_version = "client.authentication.k8s.io/v1alpha1"
      command     = "aws-iam-authenticator"
      args        = ["token", "-i", "${var.kubernetes_cluster_name}"]
    }
  }
}

We described the Terraform Kubernetes provider earlier, when we performed the Istio installation. This block is largely the same, with the exception that our connection values now come from module variables rather than from the EKS resource directly. Just like we did before, we’ll be using the Kubernetes provider to declare a new namespace.

But, notice that we’ve also introduced and configured a new provider called “helm”. Helm is a popular way of describing Kubernetes deployments and distributing Kubernetes applications. It’s similar to package management tools such as apt-get in the Linux world, and it’s designed to make installing Kubernetes based applications simple and easy. To configure our Helm provider, we simply need to provide a few Kubernetes connection parameters.

A Helm deployment is called a chart. We’ll be using a helm chart provided by the ArgoCD community to install the ArgoCD server. Add the following code to the main.tf file to complete the installation declaration:

resource "kubernetes_namespace" "example" {
  metadata {
    name = "argo"
  }
}

resource "helm_release" "argocd" {
  name       = "msur"
  chart      = "argo-cd"
  repository = "https://argoproj.github.io/argo-helm"
  namespace  = "argo"
}

This code creates a namespace for the ArgoCD installation and uses the Helm provider to perform the installation. All that’s left to complete the ArgoCD module is to define some variables.
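Before we do that, it may help to see what the helm_release resource is doing behind the scenes. The rough Helm CLI equivalent would look something like the following (a sketch for orientation only; in our setup, Terraform performs the release for us):

$ helm repo add argo https://argoproj.github.io/argo-helm
$ helm install msur argo/argo-cd --namespace argo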

Variables for Argo CD

Create a file called variables.tf in your ArgoCD module repository and add the following code:

variable "kubernetes_cluster_id" {
  type = string
}

variable "kubernetes_cluster_cert_data" {
  type = string
}

variable "kubernetes_cluster_endpoint" {
  type = string
}

variable "kubernetes_cluster_name" {
  type = string
}

variable "eks_nodegroup_id" {
  type = string
}

We need to define these variables so that we can configure the Kubernetes and Helm providers in our code. That means we’ll need to grab them from the Kubernetes module’s output when we call it in our Sandbox’s Terraform file. Before we get to that step, let’s format and validate the code we’ve written in the same way as we did for our other modules:

module-argocd$ terraform fmt
[...]
module-argocd$ terraform init
[...]
module-argocd$ terraform validate
Success! The configuration is valid.

When you are satisfied that the code is valid, commit your code changes and push them to the Github repository so that we can use the module in our Sandbox environment.

Installing ArgoCD in the Sandbox

We want the ArgoCD installation to happen as part of our Sandbox environment bootstrapping, so we need to call the module from the Terraform definition in our sandbox environment. Add the following code to the end of your sandbox module’s main.tf file to install ArgoCD:

# Add this to the end of the file
module "argo-cd-server" {
  source = "{your_repository_location}/module-argo-cd"

  kubernetes_cluster_id        = module.aws-kubernetes-cluster.eks_cluster_id
  kubernetes_cluster_name      = module.aws-kubernetes-cluster.eks_cluster_name
  kubernetes_cluster_cert_data = module.aws-kubernetes-cluster.eks_cluster_certificate_data
  kubernetes_cluster_endpoint  = module.aws-kubernetes-cluster.eks_cluster_endpoint
  eks_nodegroup_id = module.aws-kubernetes-cluster.eks_cluster_nodegroup_id
}

Now, you can commit and push the Terraform file into your CI/CD pipeline or run it locally with Terraform. When it is done, you’ll have a continuous delivery server that will help to deploy microservices easier and more reliably. With that step completed, we’ve finished defining and provisioning the Sandbox environment. All that’s left is to test it and see if it works.

Testing the Environment

Before we close out our infrastructure implementation, it’s a good idea to run a test and make sure that the environment has been provisioned as expected. We’ll do this by verifying that we can log in to the Argo CD web console. That will prove that the entire stack is running and operational. But, in order to do that, we’ll need to setup our kubectl CLI application.

Earlier in this chapter, when we were creating the Terraform code for our Kubernetes module, we added a local file resource to create a kubeconfig file. Now, we need to download that file so that we can connect to the EKS cluster using the kubectl application.

To retrieve this file, navigate to your sandbox Github repository in your browser and click on the “Actions” tab. You should see a list of builds with your latest run at the top of the screen. Select the build that you just performed and you should see a downloadable artifact called “kubeconfig” that you can click and download.

Once you’ve downloaded kubeconfig, set your KUBECONFIG environment variable to point at it. For example, if the kubeconfig file is in your ~/Downloads directory, use the following value:

$ export KUBECONFIG=~/Downloads/kubeconfig
Tip

If you like, you can copy the kubeconfig file to ~/.kube/config and avoid having to set an environment variable. Just make sure you aren’t overwriting a Kubernetes configuration you’re already using.

You can test that everything is running as expected by issuing the following command:

$ kubectl get svc

You should see something like the following in response:

TK example default response

That means our network and EKS services were provisioned and we were able to successfully connect to the cluster. As a final test, we’ll check to make sure that ArgoCD has been installed in the cluster. Run the following command and verify that the pods are running:

$ kubectl get pods -n "argo"
NAME                                                 READY     STATUS    RESTARTS   AGE
msur-argocd-server-c6d4ffcf-9z4c2                    1/1       Running   0          51s
Note

A Kubernetes pod represents a deployable unit and it consists of one or more container images.
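If you’d like to go one step further and reach the ArgoCD web console itself, a port-forward is usually the quickest route. The service name below is an assumption based on our Helm release name, so run kubectl get svc -n argo to check the exact name in your cluster:

$ kubectl port-forward svc/msur-argocd-server -n argo 8080:443

With the port-forward running, the console should be available at https://localhost:8080. The admin credentials depend on the ArgoCD version the chart installs, so consult the ArgoCD documentation for how to retrieve them.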

We’ll get a chance to use ArgoCD, the Kubernetes cluster and the rest of the infrastructure we’ve designed later in the book. But, now that we know our pipeline and configurations work, it’s time to tear it all down.

Clean Up the Infrastructure

We now have our infrastructure up and running. But, if you aren’t planning on using it right away, it’s a good idea to clean things up so you don’t incur any costs to have it running. In particular, the Elastic IP Addresses that we used for our network can be costly if we leave them up. Since our environment is now completely defined in Terraform declarative files, we can recreate it in the same way whenever we need it, so destroying the existing environment is a low risk activity.

Terraform will automatically destroy resources in the correct order for us because it has internally created a dependency graph. However, one of our resources doesn’t quite fit the Terraform way of doing things - our Istio install. Unfortunately, we’ll need to delete the Istio resources manually before we destroy the Terraform environment. If we miss this step, there is a good chance that Terraform will hang and timeout while trying to destroy the Sandbox network resources.

Run the following command to delete the Istio objects in the Kubernetes cluster:

$ istioctl manifest generate --set profile=demo | kubectl --kubeconfig kubeconfig delete -f -

TK Might use this command instead of the one above:

$ kubectl delete namespace istio-system
$ kubectl create namespace istio-system
Tip

If you forget to remove the Istio objects before you run terraform destroy, you’ll need to manually delete the AWS EC2 load balancer that EKS automatically created for Istio before the network resources can be destroyed.
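If you do end up in that situation, the AWS CLI can help you find and remove the orphaned load balancer. This is only a sketch and assumes a Classic load balancer, which is what EKS typically creates for LoadBalancer-type services; double-check in the EC2 console before deleting anything:

$ aws elb describe-load-balancers --region eu-west-2
$ aws elb delete-load-balancer --load-balancer-name <load-balancer-name> --region eu-west-2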

Now, to destroy the sandbox environment, navigate to your Sandbox repository’s working directory and run the following command:

sandbox-repository$ terraform destroy

Terraform will display the resources that it will destroy. You’ll need to answer yes to continue with the removal process, and when it’s done all of the AWS resources that we created will be gone. You can log in to the AWS console if you want to confirm that they have been removed.

Summary

We did a lot in this chapter. We created a Terraform module for our software defined network that spanned two availability zones in a single region. Next, we created a module that instantiates an AWS EKS cluster for Kubernetes and bootstraps an Istio service mesh installation. We also built a simple Helm based installation module for the ArgoCD server. Finally, we built a sandbox environment that uses all of these modules in a declarative, immutable way.

Along the way we made some important decisions about our environment, as described in Table 5-3.

Table 5-3. Microservices Infrastructure Decision Record
ID        Decision                                   Alternatives

INF-001   Public and Private Kubernetes Cluster      Private only, Public Only
INF-002   Managed Kubernetes Service (EKS)           EC2
INF-003   SIEM deferred

The infrastructure decisions we made in this chapter are in alignment with the principles we defined earlier in the book. They are also a product of our need to constrain the scope and breadth of our build so we can fit everything we need to on these pages. Most importantly, we were able to declaratively provision the core parts of a Microservice infrastructure with Terraform and a Github Action based pipeline.

We went into a lot of detail with the Terraform code in this chapter so that you could get a feel for what it takes to define an environment. We also wanted you to get hands-on with the Terraform module pattern and some of the design decisions you’ll need to make for your infrastructure. As we learn more about the Microservices we are deploying, we may need additional infrastructure modules, but later in the book we’ll use pre-written, hosted code instead of walking through it all line-by-line.

In the next chapter, we’ll move beyond the infrastructure and get into the design of our Microservices. Later in the book, we’ll get a chance to use this infrastructure design when we release the Microservices that we’ve developed.
