We have already covered the two most crucial phases of the continuous delivery process: the commit phase and automated acceptance testing. We also explained how to cluster your environments for both your application and Jenkins agents. In this chapter, we will focus on configuration management, which connects the virtual containerized environment to the real server infrastructure.
This chapter will cover the following points:
To follow along with the instructions in this chapter, you'll need the following hardware/software:
All the examples and solutions to the exercises can be found on GitHub at https://github.com/PacktPublishing/Continuous-Delivery-With-Docker-and-Jenkins-3rd-Edition/tree/main/Chapter07.
Code in Action videos for this chapter can be viewed at https://bit.ly/3JkcGLE.
Configuration management is the process of controlling configuration changes in such a way that the system maintains integrity over time. Even though the term did not originate in the IT industry, currently, it is broadly used to refer to software and hardware. In this context, it concerns the following aspects:
As an example, we can think of the calculator web service, which uses the Hazelcast server. Let's look at the following diagram, which presents how configuration management works:
The configuration management tool reads the configuration file and prepares the environment. It installs dependent tools and libraries and deploys the applications to multiple instances. Additionally, in the case of cloud deployment, it can provide the necessary infrastructure.
In the preceding example, Infrastructure Configuration specifies the required servers and Server Configuration defines that the Calculator service should be deployed in two instances, on Server 1 and Server 2, and that the Hazelcast service should be installed on Server 3. Calculator Application Configuration specifies the port and the address of the Hazelcast server so that the services can communicate.
Information
The configuration can differ, depending on the type of the environment (QA, staging, or production); for example, server addresses can be different.
There are many approaches to configuration management, but before we look into concrete solutions, let's comment on what characteristics a good configuration management tool should have.
What should a modern configuration management solution look like? Let's walk through the most important factors:
It is important to keep these points in mind while creating the configuration, and even beforehand while choosing the right configuration management tool.
In the classic sense, before the cloud era, configuration management referred to the process that started when all the servers were already in place. So, the starting point was a set of IP addresses with machines accessible via SSH. For that purpose, the most popular configuration management tools are Ansible, Puppet, and Chef. Each of them is a good choice; they are all open source products with free basic versions and paid enterprise editions. The most important differences between them are as follows:
The agentless feature is a significant advantage because it implies no need to install anything on servers. What's more, Ansible is quickly trending upward, which is why it was chosen for this book. Nevertheless, other tools can also be used successfully for the continuous delivery process.
Together with cloud transformation, the meaning of configuration management widened and started to include what is called Infrastructure as Code (IaC). As the input, you no longer need a set of IP addresses; it's enough to provide the credentials to your favorite cloud provider. Then, IaC tools can provision servers for you. What's more, each cloud provider offers a portfolio of services, so in many cases, you don't even need to provision servers, but can use cloud services directly. While you can still use Ansible, Puppet, or Chef for that purpose, there is a tool called Terraform that is dedicated to the IaC use case.
Let's first describe the classic approach to configuration management with Ansible, and then walk through the IaC solution using Terraform.
Ansible is an open source, agentless automation engine for software provisioning, configuration management, and application deployment. Its first release was in 2012, and its basic version is free for both personal and commercial use. The enterprise version is called Ansible Tower, which provides GUI management and dashboards, the REST API, role-based access control, and some more features.
We will present the installation process and a description of how Ansible can be used separately, as well as in conjunction with Docker.
Ansible uses the SSH protocol for communication and has no special requirements regarding the machine it manages. There is also no central master server, so it's enough to install the Ansible client tool anywhere; we can then use it to manage the whole infrastructure.
Information
The only requirement for the machines being managed is to have the Python tool (and obviously, the SSH server) installed. These tools are, however, almost always available on any server by default.
The installation instructions will differ depending on the operating system. In the case of Ubuntu, it's enough to run the following commands:
$ sudo apt-get install software-properties-common
$ sudo apt-add-repository ppa:ansible/ansible
$ sudo apt-get update
$ sudo apt-get install ansible
Information
You can find the installation guides for all the operating systems on the official Ansible page, at https://docs.ansible.com/ansible/latest/installation_guide/intro_installation.html.
After the installation process is complete, we can execute the ansible command to check that everything was installed successfully:
$ ansible --version
ansible [core 2.12.2]
config file = /etc/ansible/ansible.cfg
...
In order to use Ansible, we first need to define the inventory, which represents the available resources. Then, we will be able to either execute a single command or define a set of tasks using the Ansible playbook.
An inventory is a list of all the servers that are managed by Ansible. Each server requires nothing more than the Python interpreter and the SSH server installed. By default, Ansible assumes that SSH keys are used for authentication; however, it is also possible to use a username and password by adding the --ask-pass option to the Ansible commands.
Tip
SSH keys can be generated with the ssh-keygen tool, and they are usually stored in the ~/.ssh directory.
The inventory is defined by default in the /etc/ansible/hosts file (but its location can be defined with the -i parameter), and it has the following structure:
[group_name]
<server1_address>
<server2_address>
...
Tip
The inventory syntax also accepts ranges of servers, for example, www[01:22].company.com. The SSH port should also be specified (for example, with the ansible_port host variable) if it's anything other than 22 (the default).
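To make the range notation concrete, here is a minimal Python sketch of how a single numeric range could be expanded into host names. It assumes the colon form used in the Ansible inventory documentation (www[01:22].company.com); the function name expand_host_range is hypothetical and this is not Ansible's own implementation.

```python
import re
from typing import List


def expand_host_range(pattern: str) -> List[str]:
    """Expand one numeric range such as 'www[01:22].company.com' (illustrative helper)."""
    match = re.search(r"\[(\d+):(\d+)\]", pattern)
    if match is None:
        # No range in the pattern: it already names a single host.
        return [pattern]
    start, end = match.group(1), match.group(2)
    width = len(start)  # keep leading zeros: 01, 02, ...
    prefix, suffix = pattern[:match.start()], pattern[match.end():]
    # The range is inclusive on both ends.
    return [prefix + str(number).zfill(width) + suffix
            for number in range(int(start), int(end) + 1)]


hosts = expand_host_range("www[01:22].company.com")
print(len(hosts), hosts[0], hosts[-1])
```

The sketch handles only one numeric range per pattern; alphabetic ranges and multiple ranges are left out on purpose.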
There can be many groups in the inventory file. As an example, let's define two machines in one group of servers:
[webservers]
192.168.64.12
192.168.64.13
We can also create the configuration with server aliases and specify the remote user:
[webservers]
web1 ansible_host=192.168.64.12 ansible_user=ubuntu
web2 ansible_host=192.168.64.13 ansible_user=ubuntu
The preceding file defines a group called webservers, which consists of two servers. The Ansible client will log into both of them as the user ubuntu. When we have the inventory created, let's discover how we can use it to execute the same command on many servers.
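As an illustration of what this file expresses, the following Python sketch parses the same inventory text into a nested dictionary. It is a simplification under stated assumptions: parse_inventory is a hypothetical helper, it ignores quoting and the :vars and :children group suffixes, and it is not how Ansible itself loads inventories.

```python
from typing import Dict

Inventory = Dict[str, Dict[str, Dict[str, str]]]


def parse_inventory(text: str) -> Inventory:
    """Parse a minimal INI-style inventory into {group: {host: {variable: value}}}."""
    groups: Inventory = {}
    current = "ungrouped"  # hosts listed before any [group] header
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            groups.setdefault(current, {})
            continue
        # First token is the host (or alias); the rest are key=value variables.
        name, *assignments = line.split()
        host_vars = dict(item.split("=", 1) for item in assignments)
        groups.setdefault(current, {})[name] = host_vars
    return groups


INVENTORY = """
[webservers]
web1 ansible_host=192.168.64.12 ansible_user=ubuntu
web2 ansible_host=192.168.64.13 ansible_user=ubuntu
"""

inventory = parse_inventory(INVENTORY)
for alias, host_vars in inventory["webservers"].items():
    print(alias, "->", host_vars["ansible_user"] + "@" + host_vars["ansible_host"])
```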
Information
Ansible offers the possibility to dynamically pull the inventory from a cloud provider (for example, Amazon EC2/Eucalyptus), LDAP, or Cobbler. Read more about dynamic inventories at https://docs.ansible.com/ansible/latest/user_guide/intro_dynamic_inventory.html.
The simplest command we can run is a ping on all servers. Assuming that we have two remote machines (192.168.64.12 and 192.168.64.13) with SSH servers configured and the inventory file (as defined in the last section), let's execute the ping command:
$ ansible all -m ping
web1 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": false,
    "ping": "pong"
}
web2 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": false,
    "ping": "pong"
}
We used the -m <module_name> option, which allows for specifying the module that should be executed on the remote hosts. The result is successful, which means that the servers are reachable, and the authentication is configured correctly.
Note that we used all, so that all servers would be addressed, but we could also call them by the webservers group name, or by the single host alias. As a second example, let's execute a shell command on only one of the servers:
$ ansible web1 -a "/bin/echo hello"
web1 | CHANGED | rc=0 >>
hello
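The -a <arguments> option specifies the arguments that are passed to the Ansible module. In this case, we didn't specify the module, so Ansible used its default command module, which runs the arguments as a Unix command on the remote host (without shell features such as pipes or redirects; the shell module provides those). The result was successful, and hello was printed.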
Tip
If the ansible command is connecting to the server for the first time (or if the server is reinstalled), then we are prompted with the key confirmation message (the SSH message, when the host is not present in known_hosts). Since it may interrupt an automated script, we can disable the prompt message by uncommenting host_key_checking = False in the /etc/ansible/ansible.cfg file, or by setting the environment variable, ANSIBLE_HOST_KEY_CHECKING=False.
In its simplest form, the Ansible ad hoc command syntax looks as follows:
$ ansible <target> -m <module_name> -a <module_arguments>
The purpose of ad hoc commands is to do something quickly when it is not necessary to repeat it. For example, we may want to check whether a server is alive or power off all the machines for the Christmas break. This mechanism can be seen as a command execution on a group of machines, with the additional syntax simplification provided by the modules. The real power of Ansible automation, however, lies in playbooks.
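If an ad hoc command has to be launched from a script, the syntax above can be assembled programmatically. The following Python sketch only builds the argument list; build_ad_hoc_command is a hypothetical helper, and it uses just the -i, -m, and -a options mentioned in this chapter.

```python
from typing import List, Optional


def build_ad_hoc_command(target: str,
                         module: Optional[str] = None,
                         arguments: Optional[str] = None,
                         inventory: Optional[str] = None) -> List[str]:
    """Build the argument list for: ansible <target> -m <module> -a <arguments>."""
    command = ["ansible", target]
    if inventory is not None:
        command += ["-i", inventory]   # non-default inventory location
    if module is not None:
        command += ["-m", module]      # omitted: Ansible uses its default module
    if arguments is not None:
        command += ["-a", arguments]
    return command


print(build_ad_hoc_command("all", module="ping"))
print(build_ad_hoc_command("web1", arguments="/bin/echo hello"))
```

On a machine with Ansible installed, the resulting list could be passed to subprocess.run(command, check=True); the sketch deliberately stops short of executing anything.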
An Ansible playbook is a configuration file that describes how servers should be configured. It provides a way to define a sequence of tasks that should be performed on each of the machines. A playbook is expressed in the YAML configuration language, which makes it human-readable and easy to understand. Let's start with a sample playbook, and then see how we can use it.
A playbook is composed of one or many plays. Each play contains a host group name, tasks to perform, and configuration details (for example, the remote username or access rights). An example playbook might look like this:
---
- hosts: web1
  become: yes
  become_method: sudo
  tasks:
  - name: ensure apache is at the latest version
    apt: name=apache2 state=latest
  - name: ensure apache is running
    service: name=apache2 state=started enabled=yes
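This configuration contains one play, which performs the following on the web1 host: it gains root privileges with sudo, makes sure that the latest version of the apache2 package is installed, and makes sure that the apache2 service is started and enabled at boot.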
Note that each task has a human-readable name, which is used in the console output, and that apt and service are Ansible modules, while name=apache2, state=latest, and state=started are module arguments. You already saw Ansible modules and arguments while using ad hoc commands. In the preceding playbook, we only defined one play, but there can be many of them, and each can be related to different groups of hosts.
Information
Note that since we used the apt Ansible module, the playbook is dedicated to Debian/Ubuntu servers.
For example, we could define two groups of servers in the inventory: database and webservers. Then, in the playbook, we could specify the tasks that should be executed on all database-hosting machines, and some different tasks that should be executed on all the web servers. By using one command, we could set up the whole environment.
When playbook.yml is defined, we can execute it using the ansible-playbook command:
$ ansible-playbook playbook.yml
PLAY [web1] ***************************************************************
TASK [setup] **************************************************************
ok: [web1]
TASK [ensure apache is at the latest version] *****************************
changed: [web1]
TASK [ensure apache is running] *******************************************
ok: [web1]
PLAY RECAP ****************************************************************
web1: ok=3 changed=1 unreachable=0 failed=0
Tip
If the server requires entering the password for the sudo command, then we need to add the --ask-become-pass option (or its short form, -K) to the ansible-playbook command; older Ansible releases called this option --ask-sudo-pass. It's also possible to pass the sudo password (if required) by setting the extra variable, -e ansible_become_pass=<sudo_password>.
The playbook configuration was executed, and therefore, the apache2 tool was installed and started. Note that if the task has changed something on the server, it is marked as changed. On the contrary, if there was no change, the task is marked as ok.
Tip
Ansible runs each task on several hosts in parallel; the number of parallel processes (five by default) can be changed with the -f <num_of_forks> option.
We can execute the command again, as follows:
$ ansible-playbook playbook.yml
PLAY [web1] ***************************************************************
TASK [setup] **************************************************************
ok: [web1]
TASK [ensure apache is at the latest version] *****************************
ok: [web1]
TASK [ensure apache is running] *******************************************
ok: [web1]
PLAY RECAP ****************************************************************
web1: ok=3 changed=0 unreachable=0 failed=0
Note that the output is slightly different. This time, the command didn't change anything on the server. That's because each Ansible module is designed to be idempotent. In other words, executing the same module many times in a sequence should have the same effect as executing it only once.
The simplest way to achieve idempotency is to always check first whether the task has been executed yet, and only execute it if it hasn't. Idempotency is a powerful feature, and we should always write our Ansible tasks this way.
If all the tasks are idempotent, then we can execute them as many times as we want. In that context, we can think of the playbook as a description of the desired state of remote machines. Then, the ansible-playbook command takes care of bringing the machine (or group of machines) into that state.
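The check-first pattern behind idempotency can be sketched in a few lines of Python. The ensure_line function and the sample configuration lines are hypothetical; the point is only that the second run finds the desired state already in place and reports ok instead of changed.

```python
from typing import List


def ensure_line(lines: List[str], line: str) -> str:
    """Make sure `line` is present; report 'changed' or 'ok' like an Ansible task."""
    if line in lines:
        # Desired state already reached: do nothing.
        return "ok"
    lines.append(line)
    return "changed"


config = ["Listen 80"]
first_run = ensure_line(config, "ServerName calculator")
second_run = ensure_line(config, "ServerName calculator")
print(first_run, second_run, config)
```

Because the function describes a state ("the line is present") rather than an action ("append the line"), running it any number of times leaves the configuration with exactly one copy of the line.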
Some operations should only be executed if some other task reports a change. For example, imagine that you copy the configuration file to the remote machine and the Apache server should only be restarted if the configuration file has changed. How could we approach such a case?
Ansible provides an event-oriented mechanism to notify about the changes. In order to use it, we need to know two keywords: handlers, which defines the tasks that run when they are notified, and notify, which lists the handlers that a task should trigger when it reports a change.
Let's look at the following example of how we could copy the configuration to the server and restart Apache only if the configuration has changed:
  tasks:
  - name: copy configuration
    copy:
      src: foo.conf
      dest: /etc/foo.conf
    notify:
    - restart apache
  handlers:
  - name: restart apache
    service:
      name: apache2
      state: restarted
Now, we can create the foo.conf file and run the ansible-playbook command:
$ touch foo.conf
$ ansible-playbook playbook.yml
...
TASK [copy configuration] ************************************************
changed: [web1]
RUNNING HANDLER [restart apache] *****************************************
changed: [web1]
PLAY RECAP ***************************************************************
web1: ok=5 changed=2 unreachable=0 failed=0
Information
Handlers are always executed at the end of the play, and only once, even if triggered by multiple tasks.
Ansible copied the file and restarted the Apache server. It's important to understand that if we run the command again, nothing will happen. However, if we change the content of the foo.conf file and then run the ansible-playbook command, the file will be copied again (and the Apache server will be restarted):
$ echo "something" > foo.conf
$ ansible-playbook playbook.yml
...
TASK [copy configuration] *************************************************
changed: [web1]
RUNNING HANDLER [restart apache] ******************************************
changed: [web1]
PLAY RECAP ****************************************************************
web1: ok=5 changed=2 unreachable=0 failed=0
We used the copy module, which is smart enough to detect whether the file has changed and to make a change on the server only in that case.
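The notify and handlers mechanism can be modeled with a short Python sketch. Everything here (run_play, copy_file, the in-memory remote dictionary) is a hypothetical stand-in, not Ansible code; it only mirrors the behavior described above: a handler is queued when a task reports a change, and queued handlers run once at the end of the play.

```python
from typing import Callable, Dict, List, Tuple

# A task is: (name, action returning True when it changed something, handlers to notify).
Task = Tuple[str, Callable[[], bool], List[str]]


def run_play(tasks: List[Task], handlers: Dict[str, Callable[[], None]]) -> List[str]:
    notified: List[str] = []
    log: List[str] = []
    for name, action, notify in tasks:
        changed = action()
        log.append(("changed" if changed else "ok") + ": " + name)
        if changed:
            for handler_name in notify:
                if handler_name not in notified:  # queue each handler only once
                    notified.append(handler_name)
    # Handlers run at the end of the play, in the order they are defined.
    for handler_name, handler in handlers.items():
        if handler_name in notified:
            handler()
            log.append("handler: " + handler_name)
    return log


remote: Dict[str, str] = {}   # stands in for the remote filesystem
restarts: List[str] = []      # records which services were restarted


def copy_file(path: str, content: str) -> Callable[[], bool]:
    def action() -> bool:
        if remote.get(path) == content:
            return False      # file is already up to date
        remote[path] = content
        return True
    return action


handlers = {"restart apache": lambda: restarts.append("apache2")}
tasks = [("copy configuration", copy_file("/etc/foo.conf", "something"), ["restart apache"])]

first = run_play(tasks, handlers)
second = run_play(tasks, handlers)
print(first)
print(second)
```

The second run logs only an ok task and no handler, which matches the behavior seen when ansible-playbook is executed again without touching foo.conf.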
Tip
There is also a publish-subscribe mechanism in Ansible. Using it means assigning a topic to many handlers. Then, a task notifies the topic to execute all related handlers.
While the Ansible automation makes things identical and repeatable for multiple hosts, it is inevitable that servers may require some differences. For example, think of the application port number. It can be different, depending on the machine. Luckily, Ansible provides variables, which are a good mechanism to deal with server differences. Let's create a new playbook and define a variable:
---
- hosts: web1
  vars:
    http_port: 8080
The configuration defines the http_port variable with the value 8080. Now, we can use it by using the Jinja2 syntax:
  tasks:
  - name: print port number
    debug:
      msg: "Port number: {{ http_port }}"
Tip
The Jinja2 language allows far more than just reading a variable. We can use it to create conditions, loops, and much more. You can find more details on the Jinja page, at https://jinja.palletsprojects.com/.
The debug module prints the message while executing. If we run the ansible-playbook command, we can see the variable usage:
$ ansible-playbook playbook.yml
...
TASK [print port number] **************************************************
ok: [web1] => {
"msg": "Port number: 8080"
}
Apart from user-defined variables, there are also predefined automatic variables. For example, the hostvars variable stores a map with the information regarding all hosts from the inventory. Using the Jinja2 syntax, we can iterate and print the IP addresses of all the hosts in the inventory:
---
- hosts: web1
  tasks:
  - name: print IP address
    debug:
      msg: "{% for host in groups['all'] %} {{ hostvars[host]['ansible_host'] }} {% endfor %}"
Then, we can execute the ansible-playbook command:
$ ansible-playbook playbook.yml
...
TASK [print IP address] **************************************************
ok: [web1] => {
"msg": " 192.168.64.12 192.168.64.13 "
}
Note that with the use of the Jinja2 language, we can specify the flow control operations inside the Ansible playbook file.
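To picture the data that the template iterates over, the groups and hostvars variables can be thought of as plain dictionaries. The Python sketch below is an approximation with hand-written sample data taken from the inventory used in this chapter; the real hostvars structure contains many more entries, and host_addresses is a hypothetical helper.

```python
from typing import Dict, List

# Hand-written stand-ins for the automatic variables Ansible provides.
groups: Dict[str, List[str]] = {
    "all": ["web1", "web2"],
    "webservers": ["web1", "web2"],
}
hostvars: Dict[str, Dict[str, str]] = {
    "web1": {"ansible_host": "192.168.64.12", "ansible_user": "ubuntu"},
    "web2": {"ansible_host": "192.168.64.13", "ansible_user": "ubuntu"},
}


def host_addresses(group: str) -> List[str]:
    """Equivalent of looping over groups[group] and reading ansible_host for each host."""
    return [hostvars[host]["ansible_host"] for host in groups[group]]


print(host_addresses("all"))
```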
We can install any tool on the remote server by using Ansible playbooks. Imagine that we would like to have a server with MySQL. We could easily prepare a playbook similar to the one with the apache2 package. However, if you think about it, a server with MySQL is quite a common case, and someone has surely already prepared a playbook for it, so maybe we can just reuse it. This is where Ansible roles and Ansible Galaxy come into play.
An Ansible role is a well-structured playbook part prepared to be included in playbooks. Roles are separate units that always have the following directory structure:
templates/
tasks/
handlers/
vars/
defaults/
meta/
Information
You can read more about roles and what each directory means on the official Ansible page at https://docs.ansible.com/ansible/latest/user_guide/playbooks_reuse_roles.html.
In each of the directories, we can define the main.yml file, which contains the playbook parts that can be included in the playbook.yml file. Continuing the MySQL case, there is a role defined on GitHub at https://github.com/geerlingguy/ansible-role-mysql. This repository contains task templates that can be used in our playbook. Let's look at a part of the tasks/setup-Debian.yml file, which installs the mysql package in Ubuntu/Debian:
...
- name: Ensure MySQL Python libraries are installed.
  apt:
    name: "{{ mysql_python_package_debian }}"
    state: present
- name: Ensure MySQL packages are installed.
  apt:
    name: "{{ mysql_packages }}"
    state: present
  register: deb_mysql_install_packages
...
This is only one of the task files that the tasks/main.yml file includes. Other task files are responsible for installing MySQL on other operating systems.
If we use this role in order to install MySQL on the server, it's enough to create the following playbook.yml:
---
- hosts: all
  become: yes
  become_method: sudo
  roles:
  - role: geerlingguy.mysql
Such a configuration installs the MySQL database on all servers using the geerlingguy.mysql role.
Ansible Galaxy is to Ansible what Docker Hub is to Docker—it stores common roles so that they can be reused by others. You can browse the available roles on the Ansible Galaxy page at https://galaxy.ansible.com/.
To install a role from Ansible Galaxy, we can use the ansible-galaxy command:
$ ansible-galaxy install username.role_name
This command automatically downloads the role. In the case of the MySQL example, we could download the role by executing the following:
$ ansible-galaxy install geerlingguy.mysql
The command downloads the mysql role, which can later be used in the playbook file. If you defined playbook.yml as described in the preceding snippet, the following command installs MySQL on all of your servers:
$ ansible-playbook playbook.yml
Now that you know about the basics of Ansible, let's see how we can use it to deploy our own applications.
We have covered the most fundamental features of Ansible. Now, let's forget, just for a little while, about Docker, Kubernetes, and most of the things we've learned so far. Let's configure a complete deployment step by only using Ansible. We will run the calculator service on one server and the Hazelcast service on the second server.
We can specify a play in the new playbook. Let's create the playbook.yml file, with the following content:
---
- hosts: web1
  become: yes
  become_method: sudo
  tasks:
  - name: ensure Java Runtime Environment is installed
    apt:
      name: default-jre
      state: present
      update_cache: yes
  - name: create Hazelcast directory
    file:
      path: /var/hazelcast
      state: directory
  - name: download Hazelcast
    get_url:
      url: https://repo1.maven.org/maven2/com/hazelcast/hazelcast/5.0.2/hazelcast-5.0.2.jar
      dest: /var/hazelcast/hazelcast.jar
      mode: a+r
  - name: copy Hazelcast starting script
    copy:
      src: hazelcast.sh
      dest: /var/hazelcast/hazelcast.sh
      mode: a+x
  - name: configure Hazelcast as a service
    file:
      path: /etc/init.d/hazelcast
      state: link
      force: yes
      src: /var/hazelcast/hazelcast.sh
  - name: start Hazelcast
    service:
      name: hazelcast
      enabled: yes
      state: started
The configuration is executed on the web1 server and it requires root permissions. It performs a few steps that will lead to a complete Hazelcast server installation. Let's walk through what we defined:
In the same directory, let's create hazelcast.sh, which is a script (shown as follows) that is responsible for running Hazelcast as a Unix service:
#!/bin/bash
### BEGIN INIT INFO
# Provides: hazelcast
# Required-Start: $remote_fs $syslog
# Required-Stop: $remote_fs $syslog
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: Hazelcast server
### END INIT INFO
java -cp /var/hazelcast/hazelcast.jar com.hazelcast.core.server.HazelcastMemberStarter &
After this step, we could execute the playbook and have Hazelcast started on the web1 server machine. However, let's first create a second play to start the calculator service, and then run it all together.
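We prepare the calculator web service in two steps: first, we change the Hazelcast host address, and then we add the calculator deployment to the playbook.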
Previously, we hardcoded the Hazelcast host address as hazelcast, so now we should change it in the src/main/java/com/leszko/calculator/CalculatorApplication.java file to 192.168.64.12 (the same IP address we have in our inventory, as web1).
Tip
In real-life projects, the application properties are usually kept in the properties file. For example, for the Spring Boot framework, it's a file called application.properties or application.yml. Then, we could change them with Ansible and therefore be more flexible.
Finally, we can add the deployment configuration as a new play in the playbook.yml file. It is similar to the one we created for Hazelcast:
- hosts: web2
  become: yes
  become_method: sudo
  tasks:
  - name: ensure Java Runtime Environment is installed
    apt:
      name: default-jre
      state: present
      update_cache: yes
  - name: create directory for Calculator
    file:
      path: /var/calculator
      state: directory
  - name: copy Calculator starting script
    copy:
      src: calculator.sh
      dest: /var/calculator/calculator.sh
      mode: a+x
  - name: configure Calculator as a service
    file:
      path: /etc/init.d/calculator
      state: link
      force: yes
      src: /var/calculator/calculator.sh
  - name: copy Calculator
    copy:
      src: build/libs/calculator-0.0.1-SNAPSHOT.jar
      dest: /var/calculator/calculator.jar
      mode: a+x
    notify:
    - restart Calculator
  handlers:
  - name: restart Calculator
    service:
      name: calculator
      enabled: yes
      state: restarted
The configuration is very similar to what we saw in the case of Hazelcast. One difference is that this time, we don't download the JAR from the internet, but we copy it from our filesystem. The other difference is that we restart the service using the Ansible handler. That's because we want to restart the calculator each time a new version is copied.
Before we start it all together, we also need to define calculator.sh:
#!/bin/bash
### BEGIN INIT INFO
# Provides: calculator
# Required-Start: $remote_fs $syslog
# Required-Stop: $remote_fs $syslog
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: Calculator application
### END INIT INFO
java -jar /var/calculator/calculator.jar &
When everything is prepared, we will use this configuration to start the complete system.
As always, we can execute the playbook using the ansible-playbook command. Before that, we need to build the calculator project with Gradle:
$ ./gradlew build
$ ansible-playbook playbook.yml
After the successful deployment, the service should be available, and we can check that it's working at http://192.168.64.13:8080/sum?a=1&b=2 (the IP address should be the same one that we have in our inventory as web2). As expected, it should return 3 as the output.
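The manual check can also be scripted as a small smoke test. The Python sketch below uses only the standard library; sum_url and check_sum are hypothetical helpers, and the code assumes that the service answers with the plain number in the response body, as described above. The request itself is not executed here because it needs the deployed servers.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def sum_url(host: str, a: int, b: int, port: int = 8080) -> str:
    """Build the URL of the calculator's /sum endpoint."""
    return "http://{}:{}/sum?{}".format(host, port, urlencode({"a": a, "b": b}))


def check_sum(host: str, a: int, b: int, timeout: float = 5.0) -> bool:
    """Call the deployed service and compare its answer with the expected sum."""
    with urlopen(sum_url(host, a, b), timeout=timeout) as response:
        body = response.read().decode("utf-8").strip()
    return body == str(a + b)


print(sum_url("192.168.64.13", 1, 2))
# After a deployment, check_sum("192.168.64.13", 1, 2) could serve as a quick smoke test.
```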
Note that we have configured the whole environment by executing one command. What's more, if we need to scale the service, then it's enough to add a new server to the inventory and rerun the ansible-playbook command. Also, note that we could package it as an Ansible role and upload it to GitHub, and from then on, everyone could run the same system on their Ubuntu servers. That's the power of Ansible!
We have shown how to use Ansible for environmental configuration and application deployment. The next step is to use Ansible with Docker and Kubernetes.
As you may have noticed, Ansible and Docker (along with Kubernetes) address similar software deployment issues:
If we compare the tools, Docker does a little more, since it provides isolation, portability, and a kind of security. We could even imagine using Docker/Kubernetes without any other configuration management tools. Then, why do we need Ansible at all?
Ansible may seem redundant; however, it brings additional benefits to the delivery process, which are as follows:
We can look at Ansible as the tool that takes care of the infrastructure, while Docker and Kubernetes are tools that take care of the environmental configuration and clustering. An overview is presented in the following diagram:
Ansible manages the infrastructure: Kubernetes clusters, Docker servers, Docker registries, servers without Docker, and cloud providers. It also takes care of the physical location of the servers. Using the inventory host groups, it can link the web services to the databases that are close to their geographic locations.
Let's look at how we can use Ansible to install Docker on a server and deploy a sample application.
Ansible integrates with Docker smoothly, because it provides a set of Docker-dedicated modules. If we create an Ansible playbook for Docker-based deployment, then the first task is to make sure that the Docker Engine is installed on every machine. Then, it should run a container using Docker.
First, let's install Docker on an Ubuntu server.
We can install the Docker Engine by using the following task in the Ansible playbook:
- hosts: web1
  become: yes
  become_method: sudo
  tasks:
  - name: Install required packages
    apt:
      name: "{{ item }}"
      state: latest
      update_cache: yes
    loop:
    - apt-transport-https
    - ca-certificates
    - curl
    - software-properties-common
    - python3-pip
    - virtualenv
    - python3-setuptools
  - name: Add Docker GPG apt Key
    apt_key:
      url: https://download.docker.com/linux/ubuntu/gpg
      state: present
  - name: Add Docker Repository
    apt_repository:
      repo: deb https://download.docker.com/linux/ubuntu focal stable
      state: present
  - name: Update apt and install docker-ce
    apt:
      name: docker-ce
      state: latest
      update_cache: yes
  - name: Install Docker Module for Python
    pip:
      name: docker
Information
The playbook looks slightly different for each operating system. The one presented here is for Ubuntu 20.04.
This configuration installs Docker and Docker Python tools (needed by Ansible). Note that we used a new Ansible syntax, loop, in order to make the playbook more concise.
When Docker is installed, we can add a task that will run a Docker container.
Running Docker containers is done with the use of the docker_container module, and it looks as follows:
- hosts: web1
  become: yes
  become_method: sudo
  tasks:
  - name: run Hazelcast container
    community.docker.docker_container:
      name: hazelcast
      image: hazelcast/hazelcast
      state: started
      exposed_ports:
      - 5701
Information
You can read more about all of the options of the docker_container module at https://docs.ansible.com/ansible/latest/collections/community/docker/docker_container_module.html.
With the two playbooks presented previously, we configured the Hazelcast server using Docker. Note that this is very convenient because we can run the same playbook on multiple (Ubuntu) servers.
Now, let's take a look at how Ansible can help with Kubernetes.
Similar to Docker, Ansible can help with Kubernetes. When you have your Kubernetes cluster configured, then you can create Kubernetes resources using the Ansible k8s module. Here's a sample Ansible task to create a namespace in Kubernetes:
- name: Create namespace
  kubernetes.core.k8s:
    name: my-namespace
    api_version: v1
    kind: Namespace
    state: present
The configuration here makes sure a namespace called my-namespace is created in the Kubernetes cluster.
Information
You can find more information about the Ansible k8s module at https://docs.ansible.com/ansible/latest/collections/kubernetes/core/k8s_module.html.
We have covered configuration management with Ansible, which is a perfect approach if your deployment environment consists of bare-metal servers. You can also use Ansible with cloud providers, and there are a number of modules dedicated to that purpose. For example, amazon.aws.ec2_instance lets you create and manage AWS EC2 instances. However, when it comes to the cloud, there are better solutions. Let's see what they are and how to use them.
IaC is the process of managing and provisioning computing resources through machine-readable definition files rather than through manual, physical hardware configuration. It is mostly associated with the cloud approach, in which you can request the necessary infrastructure in a programmable manner.
Managing computer infrastructure was always a hard, time-consuming, and error-prone activity. You had to manually place the hardware, connect the network, install the operating system, and take care of its updates. Together with the cloud, things became simple; all you had to do was to write a few commands or make a few clicks in the web UI. IaC goes one step further, as it allows you to specify in a declarative manner what infrastructure you need. To understand it better, let's take a look at the following diagram:
You prepare a declarative description of your infrastructure, for example, that you need three servers, a Kubernetes cluster, and a load balancer. Then, you pass this configuration to a tool that uses a cloud-specific API (for example, the AWS API) in order to make sure the infrastructure is as requested. Note that you should store the infrastructure configuration in the source code repository, and you can create multiple identical environments from the same configuration.
You can see that the IaC idea is very similar to configuration management; however, while configuration management makes sure your software is configured as specified, IaC makes sure that your infrastructure is configured as specified.
Now, let's look into the benefits of using IaC.
There are a number of benefits that IaC brings to all DevOps activities. Let's walk through the most important ones:
I hope these points have convinced you that IaC is a great approach. Let's now look into the tools you can use for IaC.
When it comes to IaC, there are a number of tools you can use. The choice depends on the cloud provider you use and on your own preferences. Let's walk through the most popular solutions:
Of all the solutions mentioned, Terraform is by far the most popular. That is why we'll spend some more time understanding how it works.
Terraform is an open source tool created and maintained by HashiCorp. It allows you to specify your infrastructure in the form of human-readable configuration files. Similar to Ansible, it works in a declarative manner, which means that you specify the expected outcome, and Terraform makes sure your environment is created as specified.
Before we dive into a concrete example, let's spend a moment understanding how Terraform works.
Terraform reads a configuration file and adjusts the cloud resources accordingly. Let's look at the following diagram, which presents this process:
A user creates a Configuration File and starts the Terraform tool. Then, Terraform checks the Terraform State and uses a Terraform Provider to translate the declarative configuration file into requests sent to the Target API, which is specific to the given cloud provider. As an example, we can think of a configuration file that defines three AWS EC2 instances. Terraform uses the AWS provider, which executes requests to the AWS API to make sure that three AWS EC2 instances are created.
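As a rough sketch of such a configuration file, the following fragment declares three EC2 instances with a single resource block and the count meta-argument. The resource name calculator and the Name tags are hypothetical; the AMI ID and instance type are the ones used in the example later in this chapter, and the fragment assumes that the AWS provider is configured elsewhere in the same directory.

```hcl
# Hypothetical sketch: three identical EC2 instances declared once.
# Assumes a provider "aws" block exists elsewhere in the configuration.
resource "aws_instance" "calculator" {
  count         = 3                        # Terraform keeps exactly three instances
  ami           = "ami-04505e74c0741db8d"  # AMI ID reused from this chapter's example
  instance_type = "t2.micro"

  tags = {
    # count.index runs from 0 to 2, giving each instance a distinct name
    Name = "calculator-${count.index}"
  }
}
```

Changing count to a different number and applying again would make Terraform add or remove instances until the real infrastructure matches the declared number.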
Information
There are more than 1,000 Terraform providers available, and you can browse them via the Terraform Registry at https://registry.terraform.io/.
The Terraform workflow always consists of three stages:
This approach is very convenient because, with the plan stage, we can always check what Terraform is going to change in our infrastructure, before actually applying the change.
Now that we understand the idea behind Terraform, let's look at how it all works in practice, starting from the Terraform installation process.
The installation process depends on the operating system. In the case of Ubuntu, you can execute the following commands:
$ curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -
$ sudo apt-add-repository "deb [arch=amd64] https://apt.releases.hashicorp.com $(lsb_release -cs) main"
$ sudo apt-get update
$ sudo apt-get install terraform
Information
You can find the installation guides for all the operating systems on the official Terraform website, at https://www.terraform.io/downloads.
After the installation process, we can verify that the terraform command works correctly:
$ terraform version
Terraform v1.1.5
Now that Terraform is installed, we can move to the Terraform example.
As an example, let's use Terraform to provision an AWS EC2 instance. For this purpose, we need to first configure AWS.
To access AWS from your machine, you will need the following:
Information
You can create a free AWS account at https://aws.amazon.com/free. To install the AWS CLI tool, please check the following instructions: https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html.
Let's configure the AWS CLI with the following command:
$ aws configure
The aws configure command prompts you for your AWS access key ID and AWS secret access key.
Information
For instructions on how to create an AWS access key pair, please visit https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-quickstart.html#cli-configure-quickstart-creds.
After these steps, access to your AWS account is configured and we can start playing with Terraform.
In a fresh directory, let's create the main.tf file and add the following content:
terraform {
  required_version = ">= 1.1"              # (1)
  required_providers {
    aws = {                                # (2)
      source  = "hashicorp/aws"
      version = "~> 3.74"
    }
  }
}
provider "aws" {
  profile = "default"                      # (3)
  region  = "us-east-1"                    # (4)
}
resource "aws_instance" "my_instance" {    # (5)
  ami           = "ami-04505e74c0741db8d"  # (6)
  instance_type = "t2.micro"               # (7)
}
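In the preceding configuration, we defined the following parts:
(1) The Terraform tool version should be 1.1 or later.
(2) The configuration uses the aws provider published as hashicorp/aws, in version 3.74 or a later 3.x release.
(3) The provider uses the credentials stored in the default AWS CLI profile.
(4) The resources are created in the us-east-1 region.
(5) We declare one resource of the aws_instance type, named my_instance.
(6) The instance is launched from the given Amazon Machine Image (AMI).
(7) The instance type is t2.micro.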
You can see that the whole configuration is declarative. In other words, we define what we want, not the algorithm for how to achieve it.
When the configuration is created, we need to download the required provider from the Terraform Registry.
Let's execute the following command:
$ terraform init
This command downloads all required providers and stores them in the .terraform directory. Now, let's finally apply the Terraform configuration.
Before we make any Terraform changes, it's good to first execute terraform plan to check what changes Terraform is going to make:
$ terraform plan
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
+ create
...
We can see that by applying the configuration, we will create a resource in our infrastructure as described in the console output.
Let's now apply our configuration:
$ terraform apply
...
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
...
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
After confirming the change, you should see a lot of log output ending with the Apply complete! message, which means that our infrastructure has been created.
Now, let's verify that everything is as expected.
From the Terraform perspective, we can execute the following command to see the state of our infrastructure:
$ terraform show
# aws_instance.my_instance:
resource "aws_instance" "my_instance" {
...
}
This prints all the information about the resource we created.
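If you only care about a few attributes rather than the full state, one option is to declare output values. The sketch below assumes it is added to main.tf next to the my_instance resource; the output names are hypothetical, while id and public_ip are attributes exported by the aws_instance resource.

```hcl
# Hypothetical output names; they expose selected attributes of the
# aws_instance.my_instance resource defined in main.tf.
output "instance_id" {
  description = "ID of the EC2 instance managed by this configuration"
  value       = aws_instance.my_instance.id
}

output "instance_public_ip" {
  description = "Public IP address assigned to the instance (if any)"
  value       = aws_instance.my_instance.public_ip
}
```

Once the outputs are declared and applied, the terraform output command prints just these values instead of the whole resource description.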
Information
Terraform, like Ansible, favors idempotent operations. That is why, if we execute terraform plan or terraform apply again without changing the configuration, nothing will change. You will only see the following message: No changes. Your infrastructure matches the configuration.
We can now verify that our AWS EC2 instance is really created. Since we already installed the AWS CLI, we can check it with the following command:
$ aws ec2 describe-instances --region us-east-1
{
    "Reservations": [
        {
            "Groups": [],
            "Instances": [
                {
                    "AmiLaunchIndex": 0,
                    "ImageId": "ami-04505e74c0741db8d",
                    "InstanceId": "i-053b633c810728a97",
                    "InstanceType": "t2.micro",
                    ...
If you prefer, you can also check in the AWS web console that the instance is created.
We just verified that our Terraform configuration works as expected.
Tip
When working together with Ansible, we can make use of Ansible's dynamic inventories and let Ansible discover created EC2 instances. Read more at https://docs.ansible.com/ansible/latest/user_guide/intro_dynamic_inventory.html.
To make our example complete, let's also see how to delete created resources.
Let's remove the resources we created with the following command:
$ terraform destroy
aws_instance.my_instance: Refreshing state... [id=i-053b633c810728a97]
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
- destroy
...
Do you really want to destroy all resources?
Terraform will destroy all your managed infrastructure, as shown above.
There is no undo. Only 'yes' will be accepted to confirm.
Enter a value: yes
...
Destroy complete! Resources: 1 destroyed.
After the user confirmation, Terraform removed all the resources. You can check that our AWS EC2 instance does not exist anymore.
As the last thing with Terraform, let's see how it interacts with Kubernetes.
There are two different use cases when it comes to the interaction between Terraform and Kubernetes:
Let's present them one by one.
Each of the major cloud providers offers managed Kubernetes clusters, and we can provision them using Terraform. The following Terraform providers are available:
Using each of these providers is relatively simple and works similarly to what we described in our Terraform example.
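To give a feeling for what this can look like, here is a minimal sketch that takes Google Kubernetes Engine as one possible target. The project ID, cluster name, and zone are hypothetical placeholders, and a real cluster definition would usually need more settings (node pools, networking, and arguments that depend on the provider version), so treat this as an illustration rather than a ready-to-use configuration.

```hcl
# Hypothetical sketch of a managed Kubernetes cluster on Google Cloud.
# All names and the zone below are placeholders, not values from this chapter.
provider "google" {
  project = "my-project-id"  # assumption: replace with your own project ID
  region  = "us-central1"
}

resource "google_container_cluster" "my_cluster" {
  name               = "my-cluster"
  location           = "us-central1-a"  # a single zone keeps the cluster zonal
  initial_node_count = 1                # one node in the default node pool
}
```

The workflow stays the same as for the EC2 instance: terraform init downloads the provider, terraform plan shows the cluster to be created, and terraform apply creates it.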
Tip
If you install Kubernetes on bare-metal servers, you should use a configuration management tool, such as Ansible. To provision a cloud-managed Kubernetes cluster, you can use either Ansible or Terraform, but the latter is a better fit.
Let's also look at the second usage of Terraform with Kubernetes.
Similar to Ansible, we can use Terraform to interact with a Kubernetes cluster. In other words, instead of applying Kubernetes configurations using the kubectl command, we can use a dedicated Terraform Kubernetes provider.
A sample Terraform configuration to change Kubernetes resources looks as follows:
resource "kubernetes_namespace" "example" {
  metadata {
    name = "my-first-namespace"
  }
}
The preceding configuration creates a namespace called my-first-namespace in the Kubernetes cluster.
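The namespace resource above only works once the Kubernetes provider is declared and told how to reach the cluster. A minimal sketch follows, assuming a local kubeconfig file at the default path and a 2.x release of the hashicorp/kubernetes provider; both are assumptions and not taken from this chapter.

```hcl
terraform {
  required_providers {
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~> 2.0"  # assumption: any 2.x release of the provider
    }
  }
}

provider "kubernetes" {
  # Assumption: the same kubeconfig file that kubectl uses by default
  config_path = "~/.kube/config"
}
```

With this in place, terraform init downloads the Kubernetes provider, and terraform apply creates the namespace in whichever cluster the kubeconfig currently points to.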
Tip
There are multiple ways you can interact with a Kubernetes cluster: kubectl, Ansible, Terraform, or some other tool. As a rule of thumb, I would always first try the simplest approach, which is the kubectl command, and only incorporate Ansible or Terraform if you have some special requirements; for example, you manage multiple Kubernetes clusters at the same time.
We covered the basics of Terraform, so let's wrap up this chapter with a short summary.
We have covered configuration management and IaC approaches, together with the related tooling. Note that whether you should use Ansible, Terraform, or neither of them inside your continuous delivery pipeline highly depends on your particular use case.
Ansible shines when you have multiple bare-metal servers to manage, so if your release means making the same change to many servers at the same time, you'll most probably place Ansible commands inside your pipeline.
Terraform works best when you use the cloud. Therefore, if your release means making a change to your cloud infrastructure, then Terraform is the way to go.
However, if your environment is only a single Kubernetes cluster, then there is nothing wrong with executing kubectl commands inside your pipeline.
The other takeaway points from this chapter are as follows:
In the next chapter, we will wrap up the continuous delivery process and complete the final Jenkins pipeline.
In this chapter, we covered the fundamentals of Ansible and ways to use it with Docker and Kubernetes. As exercises, try the following tasks:
To verify your knowledge from this chapter, please answer the following questions:
To read more about configuration management and IaC, please refer to the following resources: