Ansible is not a full-fledged programming language, but it does have several features of one, and one of the most important of these is variable substitution, or using the values of variables in strings or in other variables. This chapter presents Ansible’s support for variables in more detail, including a certain type of variable that Ansible calls a fact.
The simplest way to define variables is to put a vars section in your playbook with the names and values of your variables. Recall from Example 2-8 that we used this approach to define several configuration-related variables, like this:
    vars:
      tls_dir: /etc/nginx/ssl/
      key_file: nginx.key
      cert_file: nginx.crt
      conf_file: /etc/nginx/sites-available/default
      server_name: localhost
Ansible also allows you to put variables into one or more files, using a section called vars_files. Let’s say you want to take the preceding example and put the variables in a file named nginx.yml instead of putting them right in the playbook. You would replace the vars section with a vars_files section that looks like this:
    vars_files:
      - nginx.yml
The nginx.yml file would look like Example 4-1.
    key_file: nginx.key
    cert_file: nginx.crt
    conf_file: /etc/nginx/sites-available/default
    server_name: localhost
You’ll see an example of vars_files in action in Chapter 6 when we use it to separate out the variables that hold sensitive information.
As we discussed in Chapter 3, Ansible also lets you define variables associated with hosts or groups in the inventory. You’ll do this in separate directories that live alongside either the inventory hosts file or your playbooks.
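As a rough sketch of what such a file might contain, assume an inventory group named web (a hypothetical name) and the ubuntu host used elsewhere in this chapter. A file named group_vars/web.yml next to the inventory would apply to every host in that group, and host_vars/ubuntu.yml would apply to that one host. The values below are illustrative only.

```yaml
# group_vars/web.yml (hypothetical file): applies to every host in the "web" group
server_name: www.example.com
conf_file: /etc/nginx/sites-available/default

# host_vars/ubuntu.yml would be a separate file with the same layout, for example:
# server_name: ubuntu.example.com
```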
For debugging, it’s often handy to be able to view the output of a variable. You saw in Chapter 2 how to use the debug module to print out an arbitrary message. You can also use it to output the value of the variable. It works like this:
- debug: var=myvarname
This shorthand notation, with no task name and with key=value arguments rather than pure-YAML style, is practical in development. We’ll use this form of the debug module several times in this chapter. We typically remove debug statements before going to production.
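For comparison, the same task could be written with a task name and YAML-style arguments. This is only a sketch; myvarname is a placeholder for whatever variable you want to inspect.

```yaml
# Same debug task, written with a name and YAML-style arguments.
# "myvarname" is a placeholder, not a variable defined in this chapter.
- name: show the value of myvarname
  debug:
    var: myvarname
```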
Often, you’ll need to set the value of a variable based on the result of a task. Remember that each Ansible module returns results in JSON format. To use these results, you create a registered variable using the register clause when invoking a module. Example 4-2 shows how to capture the output of the whoami command to a variable named login.
    - name: capture output of whoami command
      command: whoami
      register: login
In order to use the login variable later, you need to know the type of value to expect. The value of a variable set using the register clause is always a dictionary, but the specific keys of the dictionary will be different depending on the module that you use.
Unfortunately, the official Ansible module documentation doesn’t always describe what the return values look like for each module. It does often mention examples that use the register clause, which can be helpful. I’ve found the simplest way to find out what a module returns is to register a variable and then output that variable with the debug module.
Let’s say we run the playbook shown in Example 4-3.
    ---
    - name: show return value of command module
      hosts: fedora
      gather_facts: false
      tasks:
        - name: capture output of id command
          command: id -un
          register: login
        - debug: var=login
        - debug: msg="Logged in as user {{ login.stdout }}"
The output of the debug module looks like this:
    TASK [debug] *******************************************************************
    ok: [fedora] => {
        "login": {
            "changed": true,
            "cmd": [
                "id",
                "-un"
            ],
            "delta": "0:00:00.002262",
            "end": "2021-05-30 09:25:41.696308",
            "failed": false,
            "rc": 0,
            "start": "2021-05-30 09:25:41.694046",
            "stderr": "",
            "stderr_lines": [],
            "stdout": "vagrant",
            "stdout_lines": [
                "vagrant"
            ]
        }
    }
The changed key is present in the return value of all Ansible modules, and Ansible uses it to determine whether a state change has occurred. For the command and shell modules, this will always be set to true unless overridden with the changed_when clause, which we cover in Chapter 8.
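As a brief preview, and only as a sketch, a read-only command such as the one in Example 4-3 could be marked as never changing the host. With this clause the registered dictionary should report "changed": false.

```yaml
# Read-only command: tell Ansible it never changes the host's state.
- name: capture output of id command
  command: id -un
  register: login
  changed_when: false
```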
The cmd key contains the invoked command as a list of strings.
The rc key contains the return code. If it is nonzero, Ansible will assume the task failed to execute.
The stderr key contains any text written to standard error, as a single string.
The stdout key contains any text written to standard out, as a single string.
The stdout_lines key contains any text written to standard out, split by newline. It is a list, and each element of the list is a line of output.
If you’re using the register clause with the command module, you’ll likely want access to the stdout key, as shown in Example 4-4.
    - name: capture output of id command
      command: id -un
      register: login
    - debug: msg="Logged in as user {{ login.stdout }}"
Sometimes it’s useful to do something with the output of a failed task: for instance, when running a program fails. However, if the task fails, Ansible will stop executing tasks for the failed host. You can use the ignore_errors clause, as shown in Example 4-5, so Ansible does not stop on the error. That allows you to print the program’s output.
    - name: run myprog
      command: /opt/myprog
      register: result
      ignore_errors: true
    - debug: var=result
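Building on that example, here is a minimal sketch of reacting only when the program failed. It assumes /opt/myprog started and exited with a nonzero return code, so that the rc and stderr keys described earlier are present in the registered variable.

```yaml
- name: run myprog
  command: /opt/myprog
  register: result
  ignore_errors: true

# The "failed" test is true when the registered task reported a failure.
- name: show the error output only when myprog failed
  debug:
    msg: "myprog exited with rc={{ result.rc }}: {{ result.stderr }}"
  when: result is failed
```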
The shell module has the same output structure as the command module, but other modules have different keys.
Example 4-6 shows the relevant piece of the output of the stat module, which collects properties of a file.
    TASK [display result.stat] ***************************************************************************************
    ok: [ubuntu] => {
        "result.stat": {
            "atime": 1622724660.888851,
            "attr_flags": "e",
            "attributes": [
                "extents"
            ],
            "block_size": 4096,
            "blocks": 8,
            "charset": "us-ascii",
            "checksum": "7df51a4a26c00e5b204e547da4647b36d44dbdbf",
            "ctime": 1621374401.1193385,
            "dev": 2049,
            "device_type": 0,
            "executable": false,
            "exists": true,
            "gid": 0,
            "gr_name": "root",
            "inode": 784,
            "isblk": false,
            "ischr": false,
            "isdir": false,
            "isfifo": false,
            "isgid": false,
            "islnk": false,
            "isreg": true,
            "issock": false,
            "isuid": false,
            "mimetype": "text/plain",
            "mode": "0644",
            "mtime": 1621374219.5709288,
            "nlink": 1,
            "path": "/etc/ssh/sshd_config",
            "pw_name": "root",
            "readable": true,
            "rgrp": true,
            "roth": true,
            "rusr": true,
            "size": 3287,
            "uid": 0,
            "version": "1324051592",
            "wgrp": false,
            "woth": false,
            "writeable": true,
            "wusr": true,
            "xgrp": false,
            "xoth": false,
            "xusr": false
        }
    }
The results from the stat module describe a file in detail: its type, ownership, permissions, size, timestamps, and checksum, among other properties.
If a variable contains a dictionary, you can access the keys of the dictionary by using either a dot (.) or a subscript ([]). Example 4-6 has a variable reference that uses dot notation:
{{ result.stat }}
We could have used subscript notation instead:
{{ result['stat'] }}
This rule applies to multiple dereferences, so all of the following are equivalent:
    result['stat']['mode']
    result['stat'].mode
    result.stat['mode']
    result.stat.mode
Bas prefers dot notation, unless the key is a string that holds a character that’s not allowed as a variable name, such as a dot, space, or hyphen.
Ansible uses Jinja2 to implement variable dereferencing, so for more details on this topic, see the Jinja2 documentation on variables (https://jinja.palletsprojects.com/en/3.0.x/templates/#variables).
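To illustrate when subscript notation is needed, here is a small sketch. The ports dictionary and its keys are made up for this example; the key http-alt contains a hyphen, so it has to be reached with a subscript.

```yaml
- name: dot versus subscript notation
  hosts: localhost
  gather_facts: false
  vars:
    ports:              # hypothetical dictionary, for illustration only
      http: 80
      http-alt: 8080    # hyphenated key: use ports['http-alt']
  tasks:
    - debug:
        msg: "http={{ ports.http }}, http-alt={{ ports['http-alt'] }}"
```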
If your playbooks use registered variables, make sure you know the content of those variables, both for cases where the module changes the host’s state and for when the module doesn’t change the host’s state. Otherwise, your playbook might fail when it tries to access a key in a registered variable that doesn’t exist.
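One way to guard against a missing key, sketched here with the stat module, is to test a key that is always present (exists) before touching the others, or to fall back to a default value. When the file is absent, the stat result contains exists set to false and no mode key.

```yaml
- name: check a file that may not exist
  stat:
    path: /etc/ssh/sshd_config
  register: result

- name: print the mode only if the file exists
  debug:
    msg: "mode is {{ result.stat.mode }}"
  when: result.stat.exists

# Alternative: supply a fallback when the key is missing.
- name: print the mode or a placeholder
  debug:
    msg: "mode is {{ result.stat.mode | default('unknown') }}"
```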
As you’ve already seen, when Ansible runs a playbook, before the first task runs, this happens:
    TASK [Gathering Facts] *********************************************************
    ok: [debian]
    ok: [fedora]
    ok: [ubuntu]
When Ansible gathers facts, it connects to each host and queries it for all kinds of details: CPU architecture, operating system, IP addresses, memory info, disk info, and more. This information is stored in variables that are called facts, and they behave just like any other variable.
Here’s a playbook that prints out the operating system details of each server:
    ---
    - name: 'Ansible facts.'
      hosts: all
      gather_facts: true
      tasks:
        - name: Print out operating system details
          debug:
            msg: >-
              os_family:
              {{ ansible_os_family }},
              distro:
              {{ ansible_distribution }}
              {{ ansible_distribution_version }},
              kernel:
              {{ ansible_kernel }}
Here’s what the output looks like for the virtual machines running Debian, Fedora, and Ubuntu:
    PLAY [Ansible facts.] **********************************************************

    TASK [Gathering Facts] *********************************************************
    ok: [debian]
    ok: [fedora]
    ok: [ubuntu]

    TASK [Print out operating system details] **************************************
    ok: [ubuntu] => {
        "msg": "os_family: Debian, distro: Ubuntu 20.04, kernel: 5.4.0-73-generic"
    }
    ok: [fedora] => {
        "msg": "os_family: RedHat, distro: Fedora 34, kernel: 5.11.12-300.fc34.x86_64"
    }
    ok: [debian] => {
        "msg": "os_family: Debian, distro: Debian 10, kernel: 4.19.0-16-amd64"
    }

    PLAY RECAP *********************************************************************
    debian : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
    fedora : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
    ubuntu : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
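Because facts behave like any other variable, they can also drive conditions. The following sketch, which is not one of the book's numbered examples, runs a task only on hosts whose ansible_os_family fact is Debian; with the three machines above that would be the debian and ubuntu hosts.

```yaml
- name: act on a fact
  hosts: all
  gather_facts: true
  tasks:
    - name: runs only on Debian-family hosts
      debug:
        msg: "{{ inventory_hostname }} is in the Debian family"
      when: ansible_os_family == "Debian"
```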
Ansible implements fact collecting through the use of a special module called the setup module. You don’t need to call this module in your playbooks because Ansible does that automatically when it gathers facts. However, you can invoke it manually with the ansible command-line tool, like this:
$ ansible ubuntu -m setup
When you do this, Ansible will output all of the facts, as shown in Example 4-8.
    ubuntu | SUCCESS => {
        "ansible_facts": {
            "ansible_all_ipv4_addresses": [
                "192.168.4.10",
                "10.0.2.15"
            ],
            "ansible_all_ipv6_addresses": [
                "fe80::a00:27ff:fef1:d47",
                "fe80::a6:4dff:fe77:e100"
            ],
            (many more facts)
Note that the returned value is a dictionary whose key is ansible_facts and whose value is a dictionary that has the names and values of the actual facts.
Because Ansible collects so many facts, the setup module supports a filter parameter that lets you filter by fact name, or by specifying a glob. (A glob is what shells use to match file patterns, such as *.txt.) The filter option filters only the first-level subkey below ansible_facts.
$ ansible all -m setup -a 'filter=ansible_all_ipv6_addresses'
The output looks like this:
    debian | SUCCESS => {
        "ansible_facts": {
            "ansible_all_ipv6_addresses": [
                "fe80::a00:27ff:fe8d:c04d",
                "fe80::a00:27ff:fe55:2351"
            ]
        },
        "changed": false
    }
    fedora | SUCCESS => {
        "ansible_facts": {
            "ansible_all_ipv6_addresses": [
                "fe80::505d:173f:a6fc:3f91",
                "fe80::a00:27ff:fe48:995"
            ]
        },
        "changed": false
    }
    ubuntu | SUCCESS => {
        "ansible_facts": {
            "ansible_all_ipv6_addresses": [
                "fe80::a00:27ff:fef1:d47",
                "fe80::a6:4dff:fe77:e100"
            ]
        },
        "changed": false
    }
Using a filter helps with finding the main details of a machine’s setup.
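The same filter parameter can be used when calling setup from a task. The sketch below uses a glob to collect only the memory-related facts whose names end in _mb, then prints one of them; it assumes Linux hosts, where ansible_memtotal_mb is normally reported.

```yaml
- name: gather a subset of facts
  hosts: all
  gather_facts: false        # skip the automatic, unfiltered run
  tasks:
    - name: collect only facts whose names match the glob
      setup:
        filter: "ansible_*_mb"

    - name: show one of the collected facts
      debug:
        var: ansible_memtotal_mb
```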
If you look closely at Example 4-8, you’ll see that the output is a dictionary whose key is ansible_facts. The use of ansible_facts in the return value is an Ansible idiom. If a module returns a dictionary that contains ansible_facts as a key, Ansible will create variable names in the environment with those values and associate them with the active host.
For modules that return facts, there’s no need to register variables, since Ansible creates these variables for you automatically. In Example 4-9, the following task uses the service_facts module to retrieve facts about services, then prints out the part about the secure shell daemon. (Note the subscript notation; that’s due to the embedded dot.)
    - name: show a fact returned by a module
      hosts: debian
      gather_facts: false
      tasks:
        - name: get services facts
          service_facts:
        - debug: var=ansible_facts.services['sshd.service']
The output looks like this.
    TASK [debug] *******************************************************************
    ok: [debian] => {
        "ansible_facts.services['sshd.service']": {
            "name": "sshd.service",
            "source": "systemd",
            "state": "active",
            "status": "enabled"
        }
    }
Note that we do not need to use the register keyword when invoking service_facts, since the returned values are facts. Several modules that ship with Ansible return facts.
Ansible provides an additional mechanism for associating facts with a host. You can place one or more files on the remote host machine in the /etc/ansible/facts.d directory. Ansible will recognize the file if it is:
in .ini format
in JSON format
an executable that takes no arguments and outputs JSON on the console
These facts are available as keys of a special variable named ansible_local.
For instance, Example 4-10 shows a fact file in .ini format.
    [book]
    title=Ansible: Up and Running
    authors=Meijer, Hochstein, Moser
    publisher=O'Reilly
If you copy this file to /etc/ansible/facts.d/example.fact on the remote host, you can access the contents of the ansible_local variable in a playbook:
    - name: print ansible_local
      debug: var=ansible_local
    - name: print book title
      debug: msg="The title of the book is {{ ansible_local.example.book.title }}"
The output of these tasks looks like this:
    TASK [print ansible_local] *****************************************************
    ok: [fedora] => {
        "ansible_local": {
            "example": {
                "book": {
                    "authors": "Meijer, Hochstein, Moser",
                    "publisher": "O'Reilly",
                    "title": "Ansible: Up and Running"
                }
            }
        }
    }

    TASK [print book title] ********************************************************
    ok: [fedora] => {
        "msg": "The title of the book is Ansible: Up and Running"
    }
Note the structure of the value in the ansible_local variable. Because the fact file is named example.fact, the ansible_local variable is a dictionary that contains a key named example.
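Copying the fact file can itself be done with a play. The following is a sketch under a few assumptions: the file example.fact sits next to the playbook on the control machine, the target is the fedora host, and privilege escalation is available for writing under /etc. The final setup call re-reads local facts so that ansible_local reflects the new file within the same run.

```yaml
- name: install a local fact file
  hosts: fedora
  become: true
  tasks:
    - name: make sure the facts directory exists
      file:
        path: /etc/ansible/facts.d
        state: directory
        mode: "0755"

    - name: copy the fact file from the control machine
      copy:
        src: example.fact          # assumed to sit next to this playbook
        dest: /etc/ansible/facts.d/example.fact
        mode: "0644"

    - name: re-read local facts so ansible_local is up to date
      setup:
        filter: ansible_local

    - name: print book title
      debug:
        var: ansible_local.example.book.title
```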
Ansible also allows you to set a fact (effectively the same as defining a new variable) in a task by using the set_fact module. I often like to use set_fact immediately after service_facts to make it simpler to refer to a variable.
    - name: set nginx_state
      when: ansible_facts.services['nginx.service'] is defined
      set_fact:
        nginx_state: "{{ ansible_facts.services['nginx.service']['state'] }}"
Example 4-11 demonstrates how to use set_fact so that a variable can be referred to as nginx_state instead of ansible_facts.services['nginx.service']['state'].
Ansible defines several variables that are always available in a playbook. Some are shown in Table 4-1.
Parameter | Description |
---|---|
hostvars | A dict whose keys are Ansible hostnames and values are dicts that map variable names to values |
inventory_hostname | Fully qualified domain name of the current host as known by Ansible (e.g., myhost.example.com) |
inventory_hostname_short | Name of the current host as known by Ansible, without the domain name (e.g., myhost) |
group_names | A list of all groups that the current host is a member of |
groups | A dict whose keys are Ansible group names and values are a list of hostnames that are members of the group. Includes all and ungrouped groups: {"all": [...], "web": [...], "ungrouped": [...]} |
ansible_check_mode | A boolean that is true when running in check mode (see “Check Mode”) |
ansible_play_batch | A list of the inventory hostnames that are active in the current batch (see “Running on a Batch of Hosts at a Time”) |
ansible_play_hosts | A list of all of the inventory hostnames that are active in the current play |
ansible_version | A dict with Ansible version info: {"full": "2.3.1.0", "major": 2, "minor": 3, "revision": 1, "string": "2.3.1.0"} |
The hostvars, inventory_hostname, and groups variables merit some additional discussion.
In Ansible, variables are scoped by host. It only makes sense to talk about the value of a variable relative to a given host.
The idea that variables are relative to a given host might sound confusing, since Ansible allows you to define variables on a group of hosts. For example, if you define a variable in the vars section of a play, you are defining the variable for the set of hosts in the play. But what Ansible is really doing is creating a copy of that variable for each host in the group.
Sometimes, a task that’s running on one host needs the value of a variable defined on another host. Say you need to create a configuration file on web servers that contains the IP address of the eth1 interface of the database server, and you don’t know in advance what this IP address is. This IP address is available as the ansible_eth1.ipv4.address fact for the database server.
The solution is to use the hostvars variable. This is a dictionary that contains all of the variables defined on all of the hosts, keyed by the hostname as known to Ansible. If Ansible has not yet gathered facts on a host, you will not be able to access its facts by using the hostvars variable, unless fact caching is enabled.[1]
Continuing our example, if our database server is db.example.com, then we could put the following in a configuration template:
{{ hostvars['db.example.com'].ansible_eth1.ipv4.address }}
This evaluates to the ansible_eth1.ipv4.address fact associated with the host named db.example.com.
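Since the facts must already have been gathered for the database server, one possible arrangement is to run a play against that host first. This is a sketch only: the group name web is an assumption, and the first play does nothing except gather facts.

```yaml
# Play 1: gather facts from the database server so that they show up in hostvars.
- name: gather facts from the database server
  hosts: db.example.com
  gather_facts: true
  tasks: []

# Play 2: the web servers can now read the database server's facts.
- name: use the database address on the web servers
  hosts: web                  # assumed group name
  gather_facts: false
  tasks:
    - debug:
        msg: "db address: {{ hostvars['db.example.com'].ansible_eth1.ipv4.address }}"
```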
The inventory_hostname is the hostname of the current host, as known by Ansible. If you have defined an alias for a host, this is the alias name. For example, if your inventory contains a line like this:
ubuntu ansible_host=192.168.4.10
then inventory_hostname would be ubuntu.
You can output all of the variables associated with the current host with the help of the hostvars and inventory_hostname variables:
- debug: var=hostvars[inventory_hostname]
The groups variable can be useful when you need to access variables for a group of hosts. Let’s say we are configuring a load-balancing host, and our configuration file needs the IP addresses of all of the servers in our web group. Our configuration file contains a fragment that looks like this:
    backend web-backend
    {% for host in groups.web %}
      server {{ hostvars[host].inventory_hostname }} {{ hostvars[host].ansible_default_ipv4.address }}:80
    {% endfor %}
The generated file looks like this:
    backend web-backend
      server georgia.example.com 203.0.113.15:80
      server newhampshire.example.com 203.0.113.25:80
      server newjersey.example.com 203.0.113.38:80
With the groups variable, you can iterate over the hosts in a group from within a configuration file template, using only the group name. You can then change the hosts in the group without changing the template.
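The same lookup also works from a task rather than a template. Here is a minimal sketch that prints one line per member of the web group; it gathers facts on all web hosts first, then runs the loop once.

```yaml
- name: list the web servers
  hosts: web
  gather_facts: true          # facts for every web host are gathered before the tasks run
  tasks:
    - name: print name and address of every host in the web group
      debug:
        msg: "server {{ item }} {{ hostvars[item].ansible_default_ipv4.address }}:80"
      loop: "{{ groups['web'] }}"
      run_once: true          # one pass over the group is enough
```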
Variables set by passing -e var=value to ansible-playbook have the highest precedence, which means you can use this to override variables that are already defined. Example 4-12 shows how to set the value of the variable named greeting to the value hiya.
$ ansible-playbook 4-12-greet.yml -e greeting=hiya
Use the ansible-playbook -e variable=value method when you want to use a playbook as you would a shell script that takes a command-line argument. The -e flag effectively allows you to pass variables as arguments.
Example 4-13 shows the playbook that outputs a message specified by a variable.
    ---
    - name: pass a message on the command line
      hosts: localhost
      vars:
        greeting: "you didn't specify a message"
      tasks:
        - name: output a message
          debug:
            msg: "{{ greeting }}"
You can invoke it like this:
$ ansible-playbook 4-12-greet.yml -e greeting=hiya
The output will look like this:
    PLAY [pass a message on the command line] **************************************

    TASK [Gathering Facts] *********************************************************
    ok: [localhost]

    TASK [output a message] ********************************************************
    ok: [localhost] => {
        "msg": "hiya"
    }

    PLAY RECAP *********************************************************************
    localhost : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
If you want to put a space in the variable, you need to use quotes like this:
$ ansible-playbook 4-12-greet.yml -e 'greeting="hi there"'
You have to put single quotes around the entire 'greeting="hi there"' so that the shell interprets that as a single argument to pass to Ansible, and you have to put double quotes around "hi there" so that Ansible treats that message as a single string.
Ansible also allows you to pass a file containing the variables, instead of passing them directly on the command line, by passing @filename.yml as the argument to -e. For example, say you have a file that looks like Example 4-14.
greeting: hiya
You can pass this file to the command line like this:
$ ansible-playbook 4-12-greet.yml -e @4-13-greetvars.yml
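A variables file also sidesteps the shell quoting described earlier. As a small sketch, a file such as the following (the filename in the comment is hypothetical) carries a value with a space without any extra quotes.

```yaml
# greetvars-space.yml (hypothetical filename)
# Inside a YAML file the value may contain spaces without shell quoting.
greeting: hi there
```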
Example 4-15 shows a simple technique to display any variable given with the -e flag on the command line.
    ---
    - name: show any variable during debugging.
      hosts: all
      gather_facts: false
      tasks:
        - debug: var="{{ variable }}"
Using this technique effectively gives you a “variable variable” that you can use for debugging.
We’ve covered several ways of defining variables. It is possible to define the same variable multiple times for a host, using different values. Avoid this when you can, but if you can’t, then keep in mind Ansible’s precedence rules. When the same variable is defined in multiple ways, the precedence rules determine which value wins (or overrides).
Ansible does apply variable precedence, and you might have a use for it. Here is the order of precedence, from least to greatest. The last listed variables override all other variables:
1. command line values (for example, -u my_user; these are not variables)
2. role defaults (defined in role/defaults/main.yml)
3. inventory file or script group vars
4. inventory group_vars/all
5. playbook group_vars/all
6. inventory group_vars/*
7. playbook group_vars/*
8. inventory file or script host vars
9. inventory host_vars/*
10. playbook host_vars/*
11. host facts / cached set_facts
12. play vars
13. play vars_prompt
14. play vars_files
15. role vars (defined in role/vars/main.yml)
16. block vars (only for tasks in block)
17. task vars (only for the task)
18. include_vars
19. set_facts / registered vars
20. role (and include_role) params
21. include params
22. extra vars (for example, -e "user=my_user")
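To see a few of these levels interact, consider the following sketch, which is not one of the book's numbered examples. Going by the order above, the first debug task should print the task-level value, and the last one should print the value from set_fact, since both outrank play vars. Running the playbook with -e greeting=... should override all of them.

```yaml
- name: demonstrate variable precedence
  hosts: localhost
  gather_facts: false
  vars:
    greeting: from play vars          # play vars
  tasks:
    - name: task vars override play vars for this task only
      debug:
        msg: "{{ greeting }}"
      vars:
        greeting: from task vars      # task vars

    - name: set_fact overrides play vars for the rest of the run
      set_fact:
        greeting: from set_fact       # set_facts / registered vars

    - name: no task vars here, so the set_fact value applies
      debug:
        msg: "{{ greeting }}"
```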
In this chapter, we covered several ways to define and access variables and facts. Separating variables from tasks and creating inventories with the proper values for the variables allows you to create staging environments for your software. Ansible is very powerful in its flexibility to define data at the appropriate level. The next chapter focuses on a realistic example of deploying an application.
[1] See Chapter 11 for information about fact caching.