Chapter 7: Configuration Management with Ansible

We have already covered the two most crucial phases of the continuous delivery process: the commit phase and automated acceptance testing. We also explained how to cluster your environments for both your application and Jenkins agents. In this chapter, we will focus on configuration management, which connects the virtual containerized environment to the real server infrastructure.

This chapter will cover the following points:

  • Introducing configuration management
  • Installing Ansible
  • Using Ansible
  • Deployment with Ansible
  • Ansible with Docker and Kubernetes
  • Introducing infrastructure as code
  • Introducing Terraform

Technical requirements

To follow along with the instructions in this chapter, you'll need the following hardware/software:

  • Java 8+
  • Python
  • Remote machines with the Ubuntu operating system and SSH server installed
  • An AWS account

All the examples and solutions to the exercises can be found on GitHub at https://github.com/PacktPublishing/Continuous-Delivery-With-Docker-and-Jenkins-3rd-Edition/tree/main/Chapter07.

Code in Action videos for this chapter can be viewed at https://bit.ly/3JkcGLE.

Introducing configuration management

Configuration management is the process of controlling configuration changes in such a way that the system maintains integrity over time. Even though the term did not originate in the IT industry, currently, it is broadly used to refer to software and hardware. In this context, it concerns the following aspects:

  • Application configuration: This involves software properties that decide how the system works, which are usually expressed in the form of flags or properties files passed to the application, for example, the database address, the maximum chunk size for file processing, or the logging level. They can be applied during different development phases: build, package, deploy, or run.
  • Server configuration: This defines what dependencies should be installed on each server and specifies the way applications are orchestrated (which application is run on which server, and in how many instances).
  • Infrastructure configuration: This involves server infrastructure and environment configuration. If you use on-premises servers, then this part is related to the manual hardware and network installation; if you use cloud solutions, then this part can be automated with the infrastructure as code (IaC) approach.

As an example, we can think of the calculator web service, which uses the Hazelcast server. Let's look at the following diagram, which presents how configuration management works:

Figure 7.1 – Sample configuration management

The configuration management tool reads the configuration file and prepares the environment. It installs dependent tools and libraries and deploys the applications to multiple instances. Additionally, in the case of cloud deployment, it can provide the necessary infrastructure.

In the preceding example, Infrastructure Configuration specifies the required servers and Server Configuration defines that the Calculator service should be deployed in two instances, on Server 1 and Server 2, and that the Hazelcast service should be installed on Server 3. Calculator Application Configuration specifies the port and the address of the Hazelcast server so that the services can communicate.

Information

The configuration can differ, depending on the type of the environment (QA, staging, or production); for example, server addresses can be different.

There are many approaches to configuration management, but before we look into concrete solutions, let's comment on what characteristics a good configuration management tool should have.

Traits of good configuration management

What should a modern configuration management solution look like? Let's walk through the most important factors:

  • Automation: Each environment should be automatically reproducible, including the operating system, the network configuration, the software installed, and the applications deployed. In such an approach, fixing production issues means nothing more than an automatic rebuild of the environment. What's more, it simplifies server replications and ensures that the staging and production environments are exactly the same.
  • Version control: Every change in the configuration should be tracked, so that we know who made it, why, and when. Usually, that means keeping the configuration in the source code repository, either with the code or in a separate place. The latter solution is recommended because configuration properties have a different life cycle than the application itself. Version control also helps with fixing production issues; the configuration can always be rolled back to the previous version, and the environment automatically rebuilt. The only exception to the version control-based solution is storing credentials and other sensitive information; these should never be checked in.
  • Incremental changes: Applying a change in the configuration should not require rebuilding the whole environment. On the contrary, a small change in the configuration should only change the related part of the infrastructure.
  • Server provisioning: Thanks to automation, adding a new server should be as quick as adding its address to the configuration (and executing one command).
  • Security: The access to both the configuration management tool and the machines under its control should be well secured. When using the SSH protocol for communication, the access to the keys or credentials needs to be well protected.
  • Simplicity: Every member of the team should be able to read the configuration, make a change, and apply it to the environment. The properties themselves should also be kept as simple as possible, and the ones that are not subject to change are better off kept hardcoded.

It is important to keep these points in mind while creating the configuration, and even beforehand while choosing the right configuration management tool.

Overview of configuration management tools

In the classic sense, before the cloud era, configuration management referred to the process that started when all the servers were already in place. So, the starting point was a set of IP addresses with machines accessible via SSH. For that purpose, the most popular configuration management tools are Ansible, Puppet, and Chef. Each of them is a good choice; they are all open source products with free basic versions and paid enterprise editions. The most important differences between them are as follows:

  • Configuration language: Chef uses Ruby, Puppet uses its own DSL (based on Ruby), and Ansible uses YAML.
  • Agent-based: Puppet and Chef use agents for communication, which means that each managed server needs to have a special tool installed. Ansible, on the other hand, is agentless and uses the standard SSH protocol for communication.

The agentless feature is a significant advantage because it implies no need to install anything on servers. What's more, Ansible is quickly trending upward, which is why it was chosen for this book. Nevertheless, other tools can also be used successfully for the continuous delivery process.

Together with cloud transformation, the meaning of configuration management widened and started to include what is called IaC. As the input, you no longer need a set of IP addresses, but it's enough to provide the credentials to your favorite cloud provider. Then, IaC tools can provision servers for you. What's more, each cloud provider offers a portfolio of services, so in many cases, you don't even need to provision bare-metal servers, but directly use cloud services. While you can still use Ansible, Puppet, or Chef for that purpose, there is a tool called Terraform that is dedicated to the IaC use case.

Let's first describe the classic approach to configuration management with Ansible, and then walk through the IaC solution using Terraform.

Installing Ansible

Ansible is an open source, agentless automation engine for software provisioning, configuration management, and application deployment. Its first release was in 2012, and its basic version is free for both personal and commercial use. The enterprise version is called Ansible Tower, which provides GUI management and dashboards, the REST API, role-based access control, and some more features.

We will present the installation process and a description of how Ansible can be used separately, as well as in conjunction with Docker.

Ansible server requirements

Ansible uses the SSH protocol for communication and has no special requirements regarding the machine it manages. There is also no central master server, so it's enough to install the Ansible client tool anywhere; we can then use it to manage the whole infrastructure.

Information

The only requirement for the machines being managed is to have the Python tool (and obviously, the SSH server) installed. These tools are, however, almost always available on any server by default.

Ansible installation

The installation instructions will differ depending on the operating system. In the case of Ubuntu, it's enough to run the following commands:

$ sudo apt-get install software-properties-common

$ sudo apt-add-repository ppa:ansible/ansible

$ sudo apt-get update

$ sudo apt-get install ansible

Information

You can find the installation guides for all the operating systems on the official Ansible page, at https://docs.ansible.com/ansible/latest/installation_guide/intro_installation.html.

After the installation process is complete, we can execute the ansible command to check that everything was installed successfully:

$ ansible --version

ansible [core 2.12.2]

  config file = /etc/ansible/ansible.cfg

...

Using Ansible

In order to use Ansible, we first need to define the inventory, which represents the available resources. Then, we will be able to either execute a single command or define a set of tasks using the Ansible playbook.

Creating an inventory

An inventory is a list of all the servers that are managed by Ansible. Each server requires nothing more than the Python interpreter and the SSH server installed. By default, Ansible assumes that SSH keys are used for authentication; however, it is also possible to use a username and password by adding the --ask-pass option to the Ansible commands.

Tip

SSH keys can be generated with the ssh-keygen tool, and they are usually stored in the ~/.ssh directory.

The inventory is defined by default in the /etc/ansible/hosts file (but its location can be defined with the -i parameter), and it has the following structure:

[group_name]

<server1_address>

<server2_address>

...

Tip

The inventory syntax also accepts ranges of servers, for example, www[01:22].company.com. The SSH port should also be specified if it's anything other than 22 (the default).
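For illustration, an inventory snippet (with hypothetical hostnames) combining a range and a non-default SSH port could look like this:

[webservers]
www[01:22].company.com ansible_port=2222

Here, ansible_port tells Ansible to connect over port 2222 instead of the default 22.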

There can be many groups in the inventory file. As an example, let's define two machines in one group of servers:

[webservers]

192.168.64.12

192.168.64.13

We can also create the configuration with server aliases and specify the remote user:

[webservers]

web1 ansible_host=192.168.64.12 ansible_user=ubuntu

web2 ansible_host=192.168.64.13 ansible_user=ubuntu

The preceding file defines a group called webservers, which consists of two servers. The Ansible client will log into both of them as the user ubuntu. When we have the inventory created, let's discover how we can use it to execute the same command on many servers.

Information

Ansible offers the possibility to dynamically pull the inventory from a cloud provider (for example, Amazon EC2/Eucalyptus), LDAP, or Cobbler. Read more about dynamic inventories at https://docs.ansible.com/ansible/latest/user_guide/intro_dynamic_inventory.html.

Ad hoc commands

The simplest command we can run is a ping on all servers. Assuming that we have two remote machines (192.168.64.12 and 192.168.64.13) with SSH servers configured and the inventory file (as defined in the last section), let's execute the ping command:

$ ansible all -m ping

web1 | SUCCESS => {

    "ansible_facts": {

        "discovered_interpreter_python": "/usr/bin/python3"

    },

    "changed": false,

    "ping": "pong"

}

web2 | SUCCESS => {

    "ansible_facts": {

        "discovered_interpreter_python": "/usr/bin/python3"

    },

    "changed": false,

    "ping": "pong"

}

We used the -m <module_name> option, which allows for specifying the module that should be executed on the remote hosts. The result is successful, which means that the servers are reachable, and the authentication is configured correctly.

Note that we used all, so that all servers would be addressed, but we could also call them by the webservers group name, or by the single host alias. As a second example, let's execute a shell command on only one of the servers:

$ ansible web1 -a "/bin/echo hello"

web1 | CHANGED | rc=0 >>

hello

The -a <arguments> option specifies the arguments that are passed to the Ansible module. In this case, we didn't specify the module, so the arguments are executed as a shell Unix command. The result was successful, and hello was printed.

Tip

If the ansible command is connecting to the server for the first time (or if the server is reinstalled), then we are prompted with the key confirmation message (the SSH message, when the host is not present in known_hosts). Since it may interrupt an automated script, we can disable the prompt message by uncommenting host_key_checking = False in the /etc/ansible/ansible.cfg file, or by setting the environment variable, ANSIBLE_HOST_KEY_CHECKING=False.

In its simplest form, the Ansible ad hoc command syntax looks as follows:

$ ansible <target> -m <module_name> -a <module_arguments>

The purpose of ad hoc commands is to do something quickly when it is not necessary to repeat it. For example, we may want to check whether a server is alive or power off all the machines for the Christmas break. This mechanism can be seen as a command execution on a group of machines, with the additional syntax simplification provided by the modules. The real power of Ansible automation, however, lies in playbooks.

Playbooks

An Ansible playbook is a configuration file that describes how servers should be configured. It provides a way to define a sequence of tasks that should be performed on each of the machines. A playbook is expressed in the YAML configuration language, which makes it human-readable and easy to understand. Let's start with a sample playbook, and then see how we can use it.

Defining a playbook

A playbook is composed of one or many plays. Each play contains a host group name, tasks to perform, and configuration details (for example, the remote username or access rights). An example playbook might look like this:

---

- hosts: web1

  become: yes

  become_method: sudo

  tasks:

  - name: ensure apache is at the latest version

    apt: name=apache2 state=latest

  - name: ensure apache is running

    service: name=apache2 state=started enabled=yes

This configuration contains one play, which performs the following:

  • Only executes on the web1 host
  • Gains root access using the sudo command
  • Executes two tasks:
    • Installing the latest version of apache2: The apt Ansible module (called with two parameters, name=apache2 and state=latest) checks whether the apache2 package is installed on the server, and if it isn't, it uses the apt-get tool to install it.
    • Running the Apache2 service: The service Ansible module (called with three parameters, name=apache2, state=started, and enabled=yes) checks whether the apache2 Unix service is started, and if it isn't, it uses the service command to start it.

Note that each task has a human-readable name, which is used in the console output. Additionally, apt and service are Ansible modules, and name=apache2, state=latest, and state=started are module arguments. You already saw Ansible modules and arguments while using ad hoc commands. In the preceding playbook, we only defined one play, but there can be many of them, and each can be related to different groups of hosts.

Information

Note that since we used the apt Ansible module, the playbook is dedicated to Debian/Ubuntu servers.

For example, we could define two groups of servers in the inventory: database and webservers. Then, in the playbook, we could specify the tasks that should be executed on all database-hosting machines, and some different tasks that should be executed on all the web servers. By using one command, we could set up the whole environment.
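As a minimal sketch (assuming Debian/Ubuntu hosts and an inventory with database and webservers groups), such a playbook could look like this:

---
- hosts: database
  become: yes
  become_method: sudo
  tasks:
  - name: ensure PostgreSQL is installed
    apt:
      name: postgresql
      state: present

- hosts: webservers
  become: yes
  become_method: sudo
  tasks:
  - name: ensure Apache is installed
    apt:
      name: apache2
      state: present

Running ansible-playbook against this file configures both groups of servers in one go.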

Executing the playbook

When playbook.yml is defined, we can execute it using the ansible-playbook command:

$ ansible-playbook playbook.yml

PLAY [web1] ***************************************************************

TASK [setup] **************************************************************

ok: [web1]

TASK [ensure apache is at the latest version] *****************************

changed: [web1]

TASK [ensure apache is running] *******************************************

ok: [web1]

PLAY RECAP ****************************************************************

web1: ok=3 changed=1 unreachable=0 failed=0

Tip

If the server requires entering the password for the sudo command, then we need to add the --ask-become-pass option to the ansible-playbook command. It's also possible to pass the sudo password (if required) by setting the extra variable, -e ansible_become_pass=<sudo_password>.

The playbook configuration was executed, and therefore, the apache2 tool was installed and started. Note that if the task has changed something on the server, it is marked as changed. On the contrary, if there was no change, the task is marked as ok.

Tip

It is possible to run tasks in parallel by using the -f <num_of_threads> option.

The playbook's idempotency

We can execute the command again, as follows:

$ ansible-playbook playbook.yml

PLAY [web1] ***************************************************************

TASK [setup] **************************************************************

ok: [web1]

TASK [ensure apache is at the latest version] *****************************

ok: [web1]

TASK [ensure apache is running] *******************************************

ok: [web1]

PLAY RECAP ****************************************************************

web1: ok=3 changed=0 unreachable=0 failed=0

Note that the output is slightly different. This time, the command didn't change anything on the server. That's because each Ansible module is designed to be idempotent. In other words, executing the same module many times in a sequence should have the same effect as executing it only once.

The simplest way to achieve idempotency is to always check first whether the task has been executed yet, and only execute it if it hasn't. Idempotency is a powerful feature, and we should always write our Ansible tasks this way.

If all the tasks are idempotent, then we can execute them as many times as we want. In that context, we can think of the playbook as a description of the desired state of remote machines. Then, the ansible-playbook command takes care of bringing the machine (or group of machines) into that state.
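For tasks that wrap plain shell commands, which are not idempotent by default, the command module's creates argument provides exactly this check-first behavior; a minimal sketch (with hypothetical paths) could look like this:

- name: generate an SSH key pair only once
  command: ssh-keygen -t rsa -b 2048 -N "" -f /var/app/id_rsa
  args:
    creates: /var/app/id_rsa

On the first run, the task is reported as changed; on every subsequent run, the existing file makes Ansible skip the command and report ok.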

Handlers

Some operations should only be executed if some other tasks are changed. For example, imagine that you copy the configuration file to the remote machine and the Apache server should only be restarted if the configuration file has changed. How could we approach such a case?

Ansible provides an event-oriented mechanism to notify about the changes. In order to use it, we need to know two keywords:

  • handlers: This specifies the tasks executed when notified.
  • notify: This specifies the handlers that should be executed.

Let's look at the following example of how we could copy the configuration to the server and restart Apache only if the configuration has changed:

tasks:

- name: copy configuration

  copy:

    src: foo.conf

    dest: /etc/foo.conf

  notify:

  - restart apache

handlers:

- name: restart apache

  service:

    name: apache2

    state: restarted

Now, we can create the foo.conf file and run the ansible-playbook command:

$ touch foo.conf

$ ansible-playbook playbook.yml

...

TASK [copy configuration] ************************************************

changed: [web1]

RUNNING HANDLER [restart apache] *****************************************

changed: [web1]

PLAY RECAP ***************************************************************

web1: ok=5 changed=2 unreachable=0 failed=0   

Information

Handlers are always executed at the end of the play, and only once, even if triggered by multiple tasks.

Ansible copied the file and restarted the Apache server. It's important to understand that if we run the command again, nothing will happen. However, if we change the content of the foo.conf file and then run the ansible-playbook command, the file will be copied again (and the Apache server will be restarted):

$ echo "something" > foo.conf

$ ansible-playbook playbook.yml

...

TASK [copy configuration] *************************************************

changed: [web1]

RUNNING HANDLER [restart apache] ******************************************

changed: [web1]

PLAY RECAP ****************************************************************

web1: ok=5 changed=2 unreachable=0 failed=0   

We used the copy module, which is smart enough to detect whether the file has changed and then make a change on the server.

Tip

There is also a publish-subscribe mechanism in Ansible. Using it means assigning a topic to many handlers. Then, a task notifies the topic to execute all related handlers.
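A minimal sketch of this mechanism (with an illustrative topic name) uses the listen keyword:

tasks:
- name: copy configuration
  copy:
    src: foo.conf
    dest: /etc/foo.conf
  notify: restart web stack

handlers:
- name: restart apache
  service:
    name: apache2
    state: restarted
  listen: restart web stack
- name: restart memcached
  service:
    name: memcached
    state: restarted
  listen: restart web stack

The task notifies the restart web stack topic, and every handler listening to that topic is executed at the end of the play.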

Variables

While the Ansible automation makes things identical and repeatable for multiple hosts, it is inevitable that servers may require some differences. For example, think of the application port number. It can be different, depending on the machine. Luckily, Ansible provides variables, which are a good mechanism to deal with server differences. Let's create a new playbook and define a variable:

---

- hosts: web1

  vars:

    http_port: 8080

The configuration defines the http_port variable with the value 8080. Now, we can reference it using the Jinja2 syntax:

tasks:

- name: print port number

  debug:

    msg: "Port number: {{ http_port }}"

Tip

The Jinja2 language allows for doing way more than just getting a variable. We can use it to create conditions, loops, and much more. You can find more details on the Jinja page, at https://jinja.palletsprojects.com/.

The debug module prints the message while executing. If we run the ansible-playbook command, we can see the variable usage:

$ ansible-playbook playbook.yml

...

TASK [print port number] **************************************************

ok: [web1] => {

      "msg": "Port number: 8080"

}  

Apart from user-defined variables, there are also predefined automatic variables. For example, the hostvars variable stores a map with the information regarding all hosts from the inventory. Using the Jinja2 syntax, we can iterate and print the IP addresses of all the hosts in the inventory:

---

- hosts: web1

  tasks:

  - name: print IP address

    debug:

      msg: "{% for host in groups['all'] %} {{

              hostvars[host]['ansible_host'] }} {% endfor %}"

Then, we can execute the ansible-playbook command:

$ ansible-playbook playbook.yml

...

TASK [print IP address] **************************************************

ok: [web1] => {

      "msg": " 192.168.64.12  192.168.64.13 "

}

Note that with the use of the Jinja2 language, we can specify the flow control operations inside the Ansible playbook file.

Roles

We can install any tool on the remote server by using Ansible playbooks. Imagine that we would like to have a server with MySQL. We could easily prepare a playbook similar to the one with the apache2 package. However, if you think about it, a server with MySQL is quite a common case, and someone has surely already prepared a playbook for it, so maybe we can just reuse it. This is where Ansible roles and Ansible Galaxy come into play.

Understanding roles

An Ansible role is a well-structured playbook part prepared to be included in playbooks. Roles are separate units that always have the following directory structure:

templates/

tasks/

handlers/

vars/

defaults/

meta/

Information

You can read more about roles and what each directory means on the official Ansible page at https://docs.ansible.com/ansible/latest/user_guide/playbooks_reuse_roles.html.

In each of the directories, we can define the main.yml file, which contains the playbook parts that can be included in the playbook.yml file. Continuing the MySQL case, there is a role defined on GitHub at https://github.com/geerlingguy/ansible-role-mysql. This repository contains task templates that can be used in our playbook. Let's look at a part of the tasks/setup-Debian.yml file, which installs the mysql package in Ubuntu/Debian:

...

- name: Ensure MySQL Python libraries are installed.

  apt:

    name: "{{ mysql_python_package_debian }}"

    state: present

- name: Ensure MySQL packages are installed.

  apt:

    name: "{{ mysql_packages }}"

    state: present

  register: deb_mysql_install_packages

...

This is only one of the tasks defined in the tasks/main.yml file. Other tasks are responsible for the installation of MySQL on other operating systems.

To use this role to install MySQL on the server, it's enough to create the following playbook.yml:

---

- hosts: all

  become: yes

  become_method: sudo

  roles:

  - role: geerlingguy.mysql

Such a configuration installs the MySQL database on all servers using the geerlingguy.mysql role.
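Roles are usually parameterized with variables; a sketch that overrides one of them (assuming the role exposes a mysql_root_password variable, as described in its documentation) could look like this:

---
- hosts: all
  become: yes
  become_method: sudo
  roles:
  - role: geerlingguy.mysql
    vars:
      # placeholder value; in real projects, encrypt secrets with Ansible Vault
      mysql_root_password: super_secret_password

In real projects, such secrets should never be kept in plain text in the repository.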

Ansible Galaxy

Ansible Galaxy is to Ansible what Docker Hub is to Docker—it stores common roles so that they can be reused by others. You can browse the available roles on the Ansible Galaxy page at https://galaxy.ansible.com/.

To install a role from Ansible Galaxy, we can use the ansible-galaxy command:

$ ansible-galaxy install username.role_name

This command automatically downloads the role. In the case of the MySQL example, we could download the role by executing the following:

$ ansible-galaxy install geerlingguy.mysql

The command downloads the mysql role, which can later be used in the playbook file. If you defined playbook.yml as described in the preceding snippet, the following command installs MySQL into all of your servers:

$ ansible-playbook playbook.yml

Now that you know about the basics of Ansible, let's see how we can use it to deploy our own applications.

Deployment with Ansible

We have covered the most fundamental features of Ansible. Now, let's forget, just for a little while, about Docker, Kubernetes, and most of the things we've learned so far. Let's configure a complete deployment using only Ansible. We will run the calculator service on one server and the Hazelcast service on the second server.

Installing Hazelcast

We can specify a play in the new playbook. Let's create the playbook.yml file, with the following content:

---

- hosts: web1

  become: yes

  become_method: sudo

  tasks:

  - name: ensure Java Runtime Environment is installed

    apt:

      name: default-jre

      state: present

      update_cache: yes

  - name: create Hazelcast directory

    file:

      path: /var/hazelcast

      state: directory

  - name: download Hazelcast

    get_url:

      url: https://repo1.maven.org/maven2/com/hazelcast/hazelcast/5.0.2/hazelcast-5.0.2.jar

      dest: /var/hazelcast/hazelcast.jar

      mode: a+r

  - name: copy Hazelcast starting script

    copy:

      src: hazelcast.sh

      dest: /var/hazelcast/hazelcast.sh

      mode: a+x

  - name: configure Hazelcast as a service

    file:

      path: /etc/init.d/hazelcast

      state: link

      force: yes

      src: /var/hazelcast/hazelcast.sh

  - name: start Hazelcast

    service:

      name: hazelcast

      enabled: yes

      state: started

The configuration is executed on the web1 server and it requires root permissions. It performs a few steps that will lead to a complete Hazelcast server installation. Let's walk through what we defined:

  1. Prepare the environment: This task ensures that the Java runtime environment is installed. Basically, it prepares the server environment so that Hazelcast will have all the necessary dependencies. With more complex applications, the list of dependent tools and libraries can be way longer.
  2. Download Hazelcast tool: Hazelcast is provided in the form of a JAR, which can be downloaded from the internet. We hardcoded the version, but in a real-life scenario, it would be better to extract it to a variable.
  3. Configure application as a service: We would like to have Hazelcast running as a Unix service so that it would be manageable in the standard way. In this case, it's enough to copy a service script and link it in the /etc/init.d/ directory.
  4. Start the Hazelcast service: When Hazelcast is configured as a Unix service, we can start it in the standard way.

In the same directory, let's create hazelcast.sh, which is a script (shown as follows) that is responsible for running Hazelcast as a Unix service:

#!/bin/bash

### BEGIN INIT INFO

# Provides: hazelcast

# Required-Start: $remote_fs $syslog

# Required-Stop: $remote_fs $syslog

# Default-Start: 2 3 4 5

# Default-Stop: 0 1 6

# Short-Description: Hazelcast server

### END INIT INFO

java -cp /var/hazelcast/hazelcast.jar com.hazelcast.core.server.HazelcastMemberStarter &

After this step, we could execute the playbook and have Hazelcast started on the web1 server machine. However, let's first create a second play to start the calculator service, and then run it all together.

Deploying a web service

We prepare the calculator web service in two steps:

  1. Change the Hazelcast host address.
  2. Add calculator deployment to the playbook.

Changing the Hazelcast host address

Previously, we hardcoded the Hazelcast host address as hazelcast, so now we should change it in the src/main/java/com/leszko/calculator/CalculatorApplication.java file to 192.168.64.12 (the same IP address we have in our inventory, as web1).

Tip

In real-life projects, the application properties are usually kept in the properties file. For example, for the Spring Boot framework, it's a file called application.properties or application.yml. Then, we could change them with Ansible and therefore be more flexible.
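For example, a minimal sketch could keep a Jinja2 template next to the playbook and render it with the template module (the file names, the property name, and the variable are illustrative):

# templates/application.properties.j2
hazelcast.host={{ hazelcast_host }}

The corresponding playbook task would then look like this:

- name: configure Calculator properties
  template:
    src: application.properties.j2
    dest: /var/calculator/application.properties
  notify:
  - restart Calculator

Because the template module only reports a change when the rendered content differs, the restart Calculator handler is triggered only when the configuration actually changes.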

Adding calculator deployment to the playbook

Finally, we can add the deployment configuration as a new play in the playbook.yml file. It is similar to the one we created for Hazelcast:

- hosts: web2

  become: yes

  become_method: sudo

  tasks:

  - name: ensure Java Runtime Environment is installed

    apt:

      name: default-jre

      state: present

      update_cache: yes

  - name: create directory for Calculator

    file:

      path: /var/calculator

      state: directory

  - name: copy Calculator starting script

    copy:

      src: calculator.sh

      dest: /var/calculator/calculator.sh

      mode: a+x

  - name: configure Calculator as a service

    file:

      path: /etc/init.d/calculator

      state: link

      force: yes

      src: /var/calculator/calculator.sh

  - name: copy Calculator

    copy:

      src: build/libs/calculator-0.0.1-SNAPSHOT.jar

      dest: /var/calculator/calculator.jar

      mode: a+x

    notify:

    - restart Calculator

  handlers:

  - name: restart Calculator

    service:

      name: calculator

      enabled: yes

      state: restarted

The configuration is very similar to what we saw in the case of Hazelcast. One difference is that this time, we don't download the JAR from the internet, but we copy it from our filesystem. The other difference is that we restart the service using the Ansible handler. That's because we want to restart the calculator each time a new version is copied.

Before we start it all together, we also need to define calculator.sh:

#!/bin/bash

### BEGIN INIT INFO

# Provides: calculator

# Required-Start: $remote_fs $syslog

# Required-Stop: $remote_fs $syslog

# Default-Start: 2 3 4 5

# Default-Stop: 0 1 6

# Short-Description: Calculator application

### END INIT INFO

java -jar /var/calculator/calculator.jar &

When everything is prepared, we will use this configuration to start the complete system.

Running the deployment

As always, we can execute the playbook using the ansible-playbook command. Before that, we need to build the calculator project with Gradle:

$ ./gradlew build

$ ansible-playbook playbook.yml

After the successful deployment, the service should be available, and we can check that it's working at http://192.168.64.13:8080/sum?a=1&b=2 (the IP address should be the same one that we have in our inventory as web2). As expected, it should return 3 as the output.

Note that we have configured the whole environment by executing one command. What's more, if we need to scale the service, then it's enough to add a new server to the inventory and rerun the ansible-playbook command. Also, note that we could package it as an Ansible role and upload it to GitHub, and from then on, everyone could run the same system on their Ubuntu servers. That's the power of Ansible!

We have shown how to use Ansible for environmental configuration and application deployment. The next step is to use Ansible with Docker and Kubernetes.

Ansible with Docker and Kubernetes

As you may have noticed, Ansible and Docker (along with Kubernetes) address similar software deployment issues:

  • Environmental configuration: Both Ansible and Docker provide a way to configure the environment; however, they use different means. While Ansible uses scripts (encapsulated inside the Ansible modules), Docker encapsulates the whole environment inside a container.
  • Dependencies: Ansible provides a way to deploy different services on the same or different hosts and lets them be deployed together. Kubernetes has similar functionality, which allows for running multiple containers at the same time.
  • Scalability: Ansible helps to scale services by providing the inventory and host groups. Kubernetes has similar functionality to automatically increase or decrease the number of running containers.
  • Automation with configuration files: Docker, Kubernetes, and Ansible store the whole environmental configuration and service dependencies in files (stored in the source control repository). For Ansible, this file is called playbook.yml. In the case of Docker and Kubernetes, we have Dockerfile for the environment and deployment.yml for the dependencies and scaling.
  • Simplicity: Both tools are very simple to use and provide a way to set up the whole running environment with a configuration file and just one command execution.

If we compare the tools, Docker does a little more, since it provides isolation, portability, and a kind of security. We could even imagine using Docker/Kubernetes without any other configuration management tools. Then, why do we need Ansible at all?

Benefits of Ansible

Ansible may seem redundant; however, it brings additional benefits to the delivery process, which are as follows:

  • Docker environment: The Docker/Kubernetes hosts themselves have to be configured and managed. Every container is ultimately running on Linux machines, which need kernel patching, Docker Engine updates, and network configuration, for example. What's more, there may be different server machines with different Linux distributions, and the responsibility of Ansible is to make sure everything is up and running.
  • Non-Dockerized applications: Not everything is run inside a container. If part of the infrastructure is containerized and part is deployed in the standard way or in the cloud, then Ansible can manage it all with the playbook configuration file. There may be different reasons for not running an application as a container; for example, performance, security, specific hardware requirements, or working with the legacy software.
  • Inventory: Ansible offers a very friendly way to manage the physical infrastructure by using inventories, which store information about all the servers. It can also split the physical infrastructure into different environments—production, testing, and development.
  • Cloud provisioning: Ansible can be responsible for provisioning Kubernetes clusters or installing Kubernetes in the cloud; for example, we can imagine integration tests in which the first step is to create a Kubernetes cluster on Google Cloud Platform (GCP) (only then can we deploy the whole application and perform the testing process).
  • GUI: Ansible offers GUI managers (commercial Ansible Tower and open source AWX), which aim to improve the experience of infrastructure management.
  • Improving the testing process: Ansible can help with integration and acceptance testing, as it can encapsulate testing scripts.

We can look at Ansible as the tool that takes care of the infrastructure, while Docker and Kubernetes are tools that take care of the environmental configuration and clustering. An overview is presented in the following diagram:

Figure 7.2 – Ansible as the infrastructure manager

Ansible manages the infrastructure: Kubernetes clusters, Docker servers, Docker registries, servers without Docker, and cloud providers. It also takes care of the physical location of the servers. Using the inventory host groups, it can link the web services to the databases that are close to their geographic locations.
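As an illustration (with hypothetical hostnames), groups can be nested with the :children suffix to model such locations:

[webservers_eu]
web-eu1.company.com

[databases_eu]
db-eu1.company.com

[europe:children]
webservers_eu
databases_eu

A play targeting the europe group then addresses all the machines in that region, while plays targeting webservers_eu or databases_eu address only one layer.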

Let's look at how we can use Ansible to install Docker on a server and deploy a sample application.

The Ansible Docker playbook

Ansible integrates with Docker smoothly, because it provides a set of Docker-dedicated modules. If we create an Ansible playbook for Docker-based deployment, then the first task is to make sure that the Docker Engine is installed on every machine. Then, it should run a container using Docker.

First, let's install Docker on an Ubuntu server.

Installing Docker

We can install the Docker Engine by using the following task in the Ansible playbook:

- hosts: web1

  become: yes

  become_method: sudo

  tasks:

  - name: Install required packages

    apt:

      name: "{{ item }}"

      state: latest

      update_cache: yes

    loop:

    - apt-transport-https

    - ca-certificates

    - curl

    - software-properties-common

    - python3-pip

    - virtualenv

    - python3-setuptools

  - name: Add Docker GPG apt Key

    apt_key:

      url: https://download.docker.com/linux/ubuntu/gpg

      state: present

  - name: Add Docker Repository

    apt_repository:

      repo: deb https://download.docker.com/linux/ubuntu focal stable

      state: present

  - name: Update apt and install docker-ce

    apt:

      name: docker-ce

      state: latest

      update_cache: yes

  - name: Install Docker Module for Python

    pip:

      name: docker

Information

The playbook looks slightly different for each operating system. The one presented here is for Ubuntu 20.04.

This configuration installs Docker and Docker Python tools (needed by Ansible). Note that we used a new Ansible syntax, loop, in order to make the playbook more concise.

When Docker is installed, we can add a task that will run a Docker container.

Running Docker containers

Running Docker containers is done with the use of the docker_container module, and it looks as follows:

- hosts: web1

  become: yes

  become_method: sudo

  tasks:

  - name: run Hazelcast container

    community.docker.docker_container:

      name: hazelcast

      image: hazelcast/hazelcast

      state: started

      exposed_ports:

      - 5701

Information

You can read more about all of the options of the docker_container module at https://docs.ansible.com/ansible/latest/collections/community/docker/docker_container_module.html.

With the two playbooks presented previously, we configured the Hazelcast server using Docker. Note that this is very convenient because we can run the same playbook on multiple (Ubuntu) servers.
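If the Hazelcast container should also be reachable from other machines (for example, from a calculator service running on a different host), a possible variant of the task publishes the port instead of only exposing it:

  - name: run Hazelcast container
    community.docker.docker_container:
      name: hazelcast
      image: hazelcast/hazelcast
      state: started
      published_ports:
      - "5701:5701"

This maps port 5701 of the container to port 5701 of the host, so other servers can reach Hazelcast at the host's IP address.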

Now, let's take a look at how Ansible can help with Kubernetes.

The Ansible Kubernetes playbook

Similar to Docker, Ansible can help with Kubernetes. When you have your Kubernetes cluster configured, then you can create Kubernetes resources using the Ansible k8s module. Here's a sample Ansible task to create a namespace in Kubernetes:

- name: Create namespace

  kubernetes.core.k8s:

    name: my-namespace

    api_version: v1

    kind: Namespace

    state: present

The configuration here makes sure a namespace called my-namespace is created in the Kubernetes cluster.
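The same module can also apply complete resource manifests inline through its definition parameter; the following sketch (the image name and labels are illustrative) creates a Deployment in that namespace:

- name: Create a Deployment
  kubernetes.core.k8s:
    state: present
    definition:
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: calculator
        namespace: my-namespace
      spec:
        replicas: 2
        selector:
          matchLabels:
            app: calculator
        template:
          metadata:
            labels:
              app: calculator
          spec:
            containers:
            - name: calculator
              image: leszko/calculator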

Information

You can find more information about the Ansible k8s module at https://docs.ansible.com/ansible/latest/collections/kubernetes/core/k8s_module.html.

We have covered configuration management with Ansible, which is a perfect approach if your deployment environment consists of bare-metal servers. You can also use Ansible with cloud providers, and there are a number of modules dedicated to that purpose. For example, amazon.aws.ec2_instance lets you create and manage AWS EC2 instances. However, when it comes to the cloud, there are better solutions. Let's see what they are and how to use them.

Introducing IaC

IaC is the process of managing and provisioning computing resources instead of physical hardware configuration. It is mostly associated with the cloud approach, in which you can request the necessary infrastructure in a programmable manner.

Managing computer infrastructure was always a hard, time-consuming, and error-prone activity. You had to manually place the hardware, connect the network, install the operating system, and take care of its updates. Together with the cloud, things became simple; all you had to do was to write a few commands or make a few clicks in the web UI. IaC goes one step further, as it allows you to specify in a declarative manner what infrastructure you need. To understand it better, let's take a look at the following diagram:

Figure 7.3 – IaC

You prepare a declarative description of your infrastructure, for example, that you need three servers, a Kubernetes cluster, and a load balancer. Then, you pass this configuration to a tool that uses a cloud-specific API (for example, the AWS API) in order to make sure the infrastructure is as requested. Note that you should store the infrastructure configuration in the source code repository, and you can create multiple identical environments from the same configuration.

You can see that the IaC idea is very similar to configuration management; however, while configuration management makes sure your software is configured as specified, IaC makes sure that your infrastructure is configured as specified.

Now, let's look into the benefits of using IaC.

Benefits of IaC

There are a number of benefits that infrastructure brings into all DevOps activities. Let's walk through the most important ones:

  • Speed: Creating the whole infrastructure means nothing more than running a script, which significantly reduces the time needed before we can start deploying the applications.
  • Cost reduction: Automating the infrastructure provisioning reduces the number of DevOps team members required to operate server environments.
  • Consistency: IaC configuration files become the single point of truth, so they guarantee that every created environment is exactly the same.
  • Risk reduction: Infrastructure configuration is stored in the source code repository and follows the standard code review process, which reduces the probability of making a mistake.
  • Collaboration: Multiple people can share the code and work on the same configuration files, which increases work efficiency.

I hope these points have convinced you that IaC is a great approach. Let's now look into the tools you can use for IaC.

Tools for IaC

When it comes to IaC, there are a number of tools you can use. The choice depends on the cloud provider you use and on your own preferences. Let's walk through the most popular solutions:

  • Terraform: The most popular IaC tool on the market. It's open source and uses plugin-based modules called providers to support different infrastructure APIs. Currently, more than 1,000 Terraform providers exist, including AWS, Azure, GCP, and DigitalOcean.
  • Cloud provider specific: Each major cloud provider has its own IaC tool:
    • AWS CloudFormation: An Amazon service that allows you to specify AWS resources in the form of YAML or JSON template files
    • Azure Resource Manager (ARM): A Microsoft Azure service that allows you to create and manage Azure resources with the use of ARM template files
    • Google Cloud Deployment Manager: A Google service that allows you to manage Google Cloud Platform resources with the use of YAML files
  • General configuration management: Ansible, Chef, and Puppet all provide dedicated modules to provision the infrastructure in the most popular cloud solutions.
  • Pulumi: A very flexible tool that allows you to specify the desired infrastructure using general-purpose programming languages, such as JavaScript, Python, Go, or C#.
  • Vagrant: Usually associated with virtual machine management, it provides a number of plugins to provision infrastructure using AWS and other cloud providers.

Of all the solutions mentioned, Terraform is by far the most popular. That is why we'll spend some more time understanding how it works.

Introduction to Terraform

Terraform is an open source tool created and maintained by HashiCorp. It allows you to specify your infrastructure in the form of human-readable configuration files. Similar to Ansible, it works in a declarative manner, which means that you specify the expected outcome, and Terraform makes sure your environment is created as specified.

Before we dive into a concrete example, let's spend a moment understanding how Terraform works.

Understanding Terraform

Terraform reads a configuration file and adjusts the cloud resources accordingly. Let's look at the following diagram, which presents this process:

Figure 7.4 – Terraform workflow

A user creates Configuration File and starts the Terraform tool. Then, Terraform checks the Terraform State and uses Terraform Provider to translate the declarative configuration file into requests against Target API, which is specific to the given cloud provider. As an example, we can think of a configuration file that defines three AWS EC2 instances. Terraform uses the AWS provider, which executes requests to the AWS API to make sure that three AWS EC2 instances are created.

Information

There are more than 1,000 Terraform providers available, and you can browse them via the Terraform Registry at https://registry.terraform.io/.

The Terraform workflow always consists of three stages:

  • Write: User defines cloud resources as a configuration file.
  • Plan: Terraform compares the configuration file with the current state and prepares the execution plan.
  • Apply: User approves the plan and Terraform executes the planned operations using the cloud API.

This approach is very convenient because, with the plan stage, we can always check what Terraform is going to change in our infrastructure, before actually applying the change.

Now that we understand the idea behind Terraform, let's look at how it all works in practice, starting from the Terraform installation process.

Installing Terraform

The installation process depends on the operating system. In the case of Ubuntu, you can execute the following commands:

$ curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -

$ sudo apt-add-repository "deb [arch=amd64] https://apt.releases.hashicorp.com $(lsb_release -cs) main"

$ sudo apt-get update

$ sudo apt-get install terraform

Information

You can find the installation guides for all the operating systems on the official Terraform website, at https://www.terraform.io/downloads.

After the installation process, we can verify that the terraform command works correctly:

$ terraform version

Terraform v1.1.5

After Terraform is configured, we can move to the Terraform example.

Using Terraform

As an example, let's use Terraform to provision an AWS EC2 instance. For this purpose, we need to first configure AWS.

Configuring AWS

To access AWS from your machine, you will need the following:

  • An AWS account (as mentioned in the Technical requirements section)
  • The AWS CLI installed on your machine

Let's configure the AWS CLI with the following command:

$ aws configure

The aws command prompts you for your AWS access key ID and AWS secret access key.

Information

For instructions on how to create an AWS access key pair, please visit https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-quickstart.html#cli-configure-quickstart-creds.

After these steps, access to your AWS account is configured and we can start playing with Terraform.

Writing Terraform configuration

In a fresh directory, let's create the main.tf file and add the following content:

terraform {

  required_version = ">= 1.1"                 (1)

  required_providers {

    aws = {                                   (2)

      source  = "hashicorp/aws"

      version = "~> 3.74"

    }

  }

}

provider "aws" {

  profile = "default"                         (3)

  region  = "us-east-1"                       (4)

}

resource "aws_instance" "my_instance" {       (5)

  ami           = "ami-04505e74c0741db8d"     (6)

  instance_type = "t2.micro"                  (7)

}

In the preceding configuration, we defined the following parts:

  1. The Terraform tool version should be at least 1.1.
  2. The configuration uses the hashicorp/aws provider:
    • The provider version needs to be at least 3.74.
    • Terraform will automatically download it from the Terraform Registry.
  3. The credentials for the aws provider are stored in the default location created by the AWS CLI.
  4. The provider creates all resources in the us-east-1 region.
  5. The provider creates aws_instance (an AWS EC2 instance) named my_instance.
  6. An EC2 instance is created from ami-04505e74c0741db8d (Ubuntu 20.04 LTS in the us-east-1 region).
  7. The instance type is t2.micro.

You can see that the whole configuration is declarative. In other words, we define what we want, not the algorithm for how to achieve it.

When the configuration is created, we need to download the required provider from the Terraform Registry.

Initializing Terraform configuration

Let's execute the following command:

$ terraform init

This command downloads all required providers and stores them in the .terraform directory. Now, let's finally apply the Terraform configuration.

Applying Terraform configuration

Before we make any Terraform changes, it's good to first execute terraform plan to check which changes Terraform would make:

$ terraform plan

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:

  + create

...

We can see that by applying the configuration, we will create a resource in our infrastructure as described in the console output.

Let's now apply our configuration:

$ terraform apply

...

Do you want to perform these actions?

  Terraform will perform the actions described above.

  Only 'yes' will be accepted to approve.

  Enter a value: yes

...

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

After confirming the change, you should see a lot of logs and the last Apply complete! message, which means that our infrastructure is created.

Now, let's verify that everything is as expected.

Verifying the infrastructure

From the Terraform perspective, we can execute the following command to see the state of our infrastructure:

$ terraform show

# aws_instance.my_instance:

resource "aws_instance" "my_instance" {

...

}

This prints all the information about the resource we created.

Information

Terraform, like Ansible, favors idempotent operations. That is why, if we execute terraform plan or terraform apply again, nothing will change. You will only see the following message: No changes. Your infrastructure matches the configuration.

We can now verify that our AWS EC2 instance is really created. Since we already installed the AWS CLI, we can check it with the following command:

$ aws ec2 describe-instances --region us-east-1

{

    "Reservations": [

        {

            "Groups": [],

            "Instances": [

                {

                    "AmiLaunchIndex": 0,

                    "ImageId": "ami-04505e74c0741db8d",

                    "InstanceId": "i-053b633c810728a97",

                    "InstanceType": "t2.micro",

...

If you prefer, you can also check in the AWS web console that the instance is created.

Figure 7.5 – AWS EC2 instance created with Terraform

We just verified that our Terraform configuration works as expected.

Tip

When working together with Ansible, we can make use of Ansible's dynamic inventories and let Ansible discover created EC2 instances. Read more at https://docs.ansible.com/ansible/latest/user_guide/intro_dynamic_inventory.html.
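A minimal sketch of such a dynamic inventory uses the amazon.aws.aws_ec2 inventory plugin (note that the plugin expects the filename to end with aws_ec2.yml or aws_ec2.yaml):

# inventory.aws_ec2.yml
plugin: amazon.aws.aws_ec2
regions:
  - us-east-1

We could then list the discovered hosts with the following command:

$ ansible-inventory -i inventory.aws_ec2.yml --list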

To make our example complete, let's also see how to delete created resources.

Destroying the infrastructure

Let's remove the resources we created with the following command:

$ terraform destroy

aws_instance.my_instance: Refreshing state... [id=i-053b633c810728a97]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:

  - destroy

...

Do you really want to destroy all resources?

  Terraform will destroy all your managed infrastructure, as shown above.

  There is no undo. Only 'yes' will be accepted to confirm.

  Enter a value: yes

...

Destroy complete! Resources: 1 destroyed.

After the user confirmation, Terraform removed all the resources. You can check that our AWS EC2 instance does not exist anymore.

As the last thing with Terraform, let's see how it interacts with Kubernetes.

Terraform and Kubernetes

There are two different use cases when it comes to the interaction between Terraform and Kubernetes:

  • Provisioning a Kubernetes cluster
  • Interacting with a Kubernetes cluster

Let's present them one by one.

Provisioning a Kubernetes cluster

Each of the major cloud providers offers managed Kubernetes clusters, and we can provision them using Terraform. The following Terraform providers are available:

  • AWS: This can provision clusters in Amazon Elastic Kubernetes Service (EKS).
  • Google: This can provision clusters in Google Kubernetes Engine (GKE).
  • AzureRM: This can provision clusters in Azure Kubernetes Service (AKS).

Using each of these providers is relatively simple and works similarly to how we described in our Terraform example.

Tip

If you install Kubernetes on bare-metal servers, you should use a configuration management tool, such as Ansible. To provision a cloud-managed Kubernetes cluster, you can use either Ansible or Terraform, but the latter is a better fit.

Let's also look at the second usage of Terraform with Kubernetes.

Interacting with a Kubernetes cluster

Similar to Ansible, we can use Terraform to interact with a Kubernetes cluster. In other words, instead of applying Kubernetes configurations using the kubectl command, we can use a dedicated Terraform Kubernetes provider.

A sample Terraform configuration to change Kubernetes resources looks as follows:

resource "kubernetes_namespace" "example" {

  metadata {

    name = "my-first-namespace"

  }

}

The preceding configuration creates a namespace called my-first-namespace in the Kubernetes cluster.

Tip

There are multiple ways you can interact with a Kubernetes cluster: kubectl, Ansible, Terraform, or some other tool. As a rule of thumb, I would always first try the simplest approach, which is the kubectl command, and only incorporate Ansible or Terraform if you have some special requirements; for example, you manage multiple Kubernetes clusters at the same time.

We covered the basics of Terraform, so let's wrap up this chapter with a short summary.

Summary

We have covered configuration management and IaC approaches, together with the related tooling. Note that whether you should use Ansible, Terraform, or neither of them inside your continuous delivery pipeline highly depends on your particular use case.

Ansible shines when you have multiple bare-metal servers to manage, so if your release means making the same change into many servers at the same time, you'll most probably place Ansible commands inside your pipeline.

Terraform works best when you use the cloud. Therefore, if your release means making a change to your cloud infrastructure, then Terraform is the way to go.

However, if your environment is only a single Kubernetes cluster, then there is nothing wrong with executing kubectl commands inside your pipeline.

The other takeaway points from this chapter are as follows:

  • Configuration management is the process of creating and applying the configurations of the application.
  • Ansible is one of the most trending configuration management tools. It is agentless, and therefore, it requires no special server configuration.
  • Ansible can be used with ad hoc commands, but the real power lies in Ansible playbooks.
  • The Ansible playbook is a definition of how the environment should be configured.
  • The purpose of Ansible roles is to reuse parts of playbooks.
  • Ansible Galaxy is an online service to share Ansible roles.
  • IaC is a process of managing cloud resources.
  • Terraform is the most popular tool for IaC.

In the next chapter, we will wrap up the continuous delivery process and complete the final Jenkins pipeline.

Exercises

In this chapter, we covered the fundamentals of Ansible and ways to use it with Docker and Kubernetes. As exercises, try the following tasks:

  1. Create the server infrastructure and use Ansible to manage it:
    1. Connect a physical machine or run a VirtualBox machine to emulate the remote server.
    2. Configure SSH access to the remote machine (SSH keys).
    3. Install Python on the remote machine.
    4. Create an Ansible inventory with the remote machine.
    5. Run the Ansible ad hoc command (with the ping module) to check that the infrastructure is configured correctly.
  2. Create a Python-based hello world web service and deploy it in a remote machine using Ansible playbook:
    1. The service can look exactly the same as we described in the exercises for the chapter.
    2. Create a playbook that deploys the service into the remote machine.
    3. Run the ansible-playbook command and check whether the service was deployed.
  3. Provision a GCP virtual machine instance using Terraform:
    1. Create an account in GCP.
    2. Install the gcloud tool and authenticate (gcloud init).
    3. Generate credentials and export them into the GOOGLE_APPLICATION_CREDENTIALS environment variable.
    4. Create a Terraform configuration that provisions a virtual machine instance.
    5. Apply the configuration using Terraform.
    6. Verify that the instance was created.

Questions

To verify your knowledge from this chapter, please answer the following questions:

  1. What is configuration management?
  2. What does it mean that the configuration management tool is agentless?
  3. What are the three most popular configuration management tools?
  4. What is Ansible inventory?
  5. What is the difference between Ansible ad hoc commands and playbooks?
  6. What is an Ansible role?
  7. What is Ansible Galaxy?
  8. What is IaC?
  9. What are the most popular tools for IaC?
