Structuring configuration data in a hierarchy

In the previous section, we reduced the data problem to a simple need for key/value pairs that are specific to each node under Puppet management. Puppet and its manifests then serve as the engine that generates actual configuration from these minimalistic bits of information.

A simplistic approach to this problem is an INI-style configuration file with a section for each node that sets the values for all configurable keys. Shared values are declared in one or more general sections:

[mysql]
buffer_pool=15G
log_file_size=500M
...
[xndp12-sql01.example.net]
psk=xneFGl%23ndfAWLN34a0t9w30.zges4
server_id=1
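
To make the idea of general and node-specific sections more tangible, here is a minimal Python sketch that merges them for one node. It is only an illustration under a few assumptions: the `node_settings` helper and its `shared_sections` parameter are hypothetical, the node-level `log_file_size` override was added for demonstration, and the rule that node values win over shared values is an assumed convention rather than anything the INI format prescribes.

```python
import configparser

# Sample data modeled on the listing above. The node-level log_file_size
# entry is a hypothetical override added for illustration.
INI_TEXT = """
[mysql]
buffer_pool = 15G
log_file_size = 500M

[xndp12-sql01.example.net]
server_id = 1
log_file_size = 1G
"""


def node_settings(ini_text, node, shared_sections):
    """Merge the shared sections first, then let the node section override."""
    # Interpolation is disabled so that literal % characters in values
    # (as in a pre-shared key) are not treated as placeholders.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    merged = {}
    for section in shared_sections:
        merged.update(parser[section])
    if parser.has_section(node):
        merged.update(parser[node])
    return merged


settings = node_settings(INI_TEXT, "xndp12-sql01.example.net", ["mysql"])
for key, value in sorted(settings.items()):
    print(f"{key} = {value}")
```

Note that the caller has to know which shared sections apply to which node. This is exactly the kind of bookkeeping that a real hierarchy takes off your hands.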

Rails applications customarily do something similar and store their configuration in a YAML format. The user can define different environments, such as production, staging, and testing. The values that are defined per environment override the global setting values.

This is quite close to the type of hierarchical configuration that Puppet allows through its Hiera binding. The hierarchies that such Rails applications and INI files achieve are quite flat: there is a global layer and an overlay for specialized configuration. With Hiera and Puppet, a single configuration database will typically handle whole clusters of machines and entire networks of such clusters. This implies the need for a more elaborate hierarchy.

Hiera allows you to define your own hierarchical layers. There are some typical, proven examples, which are found in many configurations out there:

  • The common layer holds default values for all agents
  • A location layer can override some values in accordance with the data center that houses each respective node
  • A role layer reflects the distinct role that each agent machine typically fills in your infrastructure, such as wordpress_appserver or puppetdb_server
  • A machine layer holds the configuration that is specific to each single machine

For example, consider the configuration of a hypothetical reporting client. Your common layer will hold lots of presets such as default verbosity settings, the transport compression option, and other choices that should work for most machines. On the location layer, you ensure that each machine checks in to the respective local server—reporting should not use WAN resources.

Settings per role are perhaps the most interesting part. They allow fine-grained settings that are specific to a class of servers. Perhaps your application servers should monitor their memory consumption at very short intervals. For the database servers, you will want a closer look at hard drive operations and performance. For your Puppet servers, there might be special plugins that gather specific data.

The machine layer is very useful in order to declare any exceptions to the rule. There are always some machines that require special treatment for one reason or another. With a top hierarchy layer that holds data for each single agent, you get full control over all the data that an agent uses.
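
The layering described above boils down to a lookup that walks from the most specific layer to the least specific one and stops at the first hit. The following Python sketch illustrates just that principle. It is not how Hiera is implemented, and all keys and values for the hypothetical reporting client are made up for the example.

```python
# Layers ordered from most to least specific: machine, role, location, common.
# Every key and value here is hypothetical.
layers = [
    ("node", {"verbosity": "debug"}),
    ("role", {"memory_check_interval": 10}),
    ("location", {"server": "stats.dc1.example.net"}),
    ("common", {
        "verbosity": "info",
        "compression": True,
        "memory_check_interval": 300,
    }),
]


def lookup(key, layers):
    """Return (value, layer_name) from the first layer that defines the key."""
    for name, data in layers:
        if key in data:
            return data[key], name
    raise KeyError(key)


for key in ("verbosity", "memory_check_interval", "server", "compression"):
    value, layer = lookup(key, layers)
    print(f"{key}: {value!r} (from the {layer} layer)")
```

The machine-level `verbosity` shadows the common default, whereas `compression` falls through all the way to the common layer because no other layer defines it.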

These ideas are still quite abstract, so let's finally look at the actual application of Hiera.

Configuring Hiera

The support for retrieving data values from Hiera has been built into Puppet since version 3. All you need in order to get started is a hiera.yaml file in the configuration directory.

Tip

Of course, the location and name of the configuration is customizable, as is almost everything that is related to configuration. Look for the hiera_config setting.

As the filename extension suggests, the configuration is in the YAML format and contains a hash with keys for the backends, the hierarchy, and backend-specific settings. The keys are noted as Ruby symbols with a leading colon:

# /etc/puppetlabs/code/hiera.yaml
:backends:
  - yaml
:hierarchy: 
  - node/%{::clientcert}
  - role/%{::role}
  - location/%{::datacenter}
  - common
:yaml: 
  :datadir: /etc/puppetlabs/code/environments/%{::environment}/hieradata

Note that the value of :backends is actually a single-element array, so you can pick multiple backends. The significance of this will be explained later. The :hierarchy value contains a list of the actual layers that were described earlier. Each entry is the name of a data source. When Hiera retrieves a value, it searches each data source in turn, from top to bottom. The %{} expression allows you to access the values of Puppet variables. Use only facts or global scope variables here; anything else will make Hiera's behavior quite confusing.
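
To see what the %{} expressions do to the hierarchy, consider this small Python sketch. It mimics the substitution with a regular expression and is not Hiera's actual code. The `resolve_sources` helper is hypothetical, the datacenter value `dc1` is invented, and replacing unknown variables with an empty string is an assumption made for brevity.

```python
import re

# The hierarchy from the hiera.yaml example above.
HIERARCHY = [
    "node/%{::clientcert}",
    "role/%{::role}",
    "location/%{::datacenter}",
    "common",
]

# Matches %{name} and %{::name}; the leading :: marks the global scope.
TOKEN = re.compile(r"%\{(?:::)?(\w+)\}")


def resolve_sources(hierarchy, facts):
    """Replace each %{...} token with the matching value from facts."""
    def substitute(match):
        # Assumption of this sketch: unknown variables become empty strings.
        return str(facts.get(match.group(1), ""))
    return [TOKEN.sub(substitute, entry) for entry in hierarchy]


facts = {
    "clientcert": "xndp12-sql01.example.net",
    "role": "postgres",
    "datacenter": "dc1",  # hypothetical location name
}
sources = resolve_sources(HIERARCHY, facts)
for source in sources:
    print(source)
```

For this node, the four hierarchy entries resolve to four concrete data source names, which are searched in the listed order.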

Finally, you will need to include configurations for each of your backends. The configuration above uses the YAML backend only, so there is only a hash for :yaml with the one supported :datadir key. This is where Hiera will expect to find YAML files with data. For each data source, the datadir can contain one .yaml file. As the names of the sources are dynamic, you will typically end up with many more data source files than the four entries in the hierarchy. Let's create some examples before we have a short discussion on the combination of multiple backends.
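
The following Python sketch shows how quickly the number of data source files grows, even for a tiny fleet. The three nodes, their roles, and their datacenters are all hypothetical, the `source_files` helper is invented for the illustration, and the environment in the datadir is assumed to be production.

```python
from pathlib import PurePosixPath

# The datadir from the example configuration, with the environment
# assumed to be "production".
DATADIR = PurePosixPath("/etc/puppetlabs/code/environments/production/hieradata")

# A hypothetical fleet of three agents.
nodes = [
    {"clientcert": "db01.example.net", "role": "postgres", "datacenter": "dc1"},
    {"clientcert": "db02.example.net", "role": "postgres", "datacenter": "dc2"},
    {"clientcert": "web01.example.net", "role": "wordpress_appserver", "datacenter": "dc1"},
]


def source_files(node):
    """List the .yaml files that the example hierarchy consults for a node."""
    names = [
        f"node/{node['clientcert']}",
        f"role/{node['role']}",
        f"location/{node['datacenter']}",
        "common",
    ]
    return [DATADIR / f"{name}.yaml" for name in names]


# Collect the distinct files across the whole fleet.
all_files = sorted({path for node in nodes for path in source_files(node)})
for path in all_files:
    print(path)
print(len(all_files), "distinct data source files")
```

Three nodes with two roles in two locations already account for eight possible files. Not all of them have to exist, though. You only create the ones that actually need to hold data.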

Storing Hiera data

The backend of your Hiera setup determines how you have to store your configuration values. For the YAML backend, you fill datadir with files that each hold a hash of values. Let's put some elements of the reporting engine configuration into the example hierarchy:

# /etc/puppetlabs/code/environments/production/hieradata/common.yaml
reporting::server: stats01.example.net
reporting::server_port: 9033

The values in common.yaml are defaults that are used for all agents. They are at the broad base of the hierarchy. Values that are specific to a location or role apply to smaller groups of your agents. For example, the database servers of the postgres role should run some special reporting plugins:

# /etc/puppetlabs/code/environments/production/hieradata/role/postgres.yaml
reporting::plugins:
  - iops
  - cpuload

On such higher layers, you can also override the values from the lower layers. For example, a role-specific data source such as role/postgres.yaml can set a value for reporting::server_port as well. The layers are searched from the most to the least specific, and the first value that is found is used. This is why it is a good idea to have a node-specific data source at the top of the hierarchy. On this layer, you can override any value for each agent. In this example, the reporting node can use the loopback interface to reach itself:

# /etc/puppetlabs/.../hieradata/node/stats01.example.net.yaml
reporting::server: localhost

Each agent receives a patchwork of configuration values according to the concrete YAML files that make up its specific hierarchy.
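
The patchwork can be made visible with a short Python sketch that records which data source supplies each value. The data mirrors the three YAML files shown above, expressed as Python dictionaries. The `effective_values` helper and the db01.example.net node are hypothetical, and the location layer is left out to keep the example short.

```python
# Contents of the example data sources, keyed by data source name.
DATA = {
    "common": {
        "reporting::server": "stats01.example.net",
        "reporting::server_port": 9033,
    },
    "role/postgres": {
        "reporting::plugins": ["iops", "cpuload"],
    },
    "node/stats01.example.net": {
        "reporting::server": "localhost",
    },
}


def effective_values(sources, data):
    """Map each key to (value, source), letting the first source win."""
    result = {}
    for source in sources:  # ordered from most to least specific
        # Data sources without a file are simply skipped.
        for key, value in data.get(source, {}).items():
            result.setdefault(key, (value, source))
    return result


# A hypothetical database server with the postgres role.
db01 = effective_values(["node/db01.example.net", "role/postgres", "common"], DATA)
# The reporting server itself; no role data source is assumed for it here.
stats01 = effective_values(["node/stats01.example.net", "common"], DATA)

for name, values in (("db01", db01), ("stats01", stats01)):
    print(name)
    for key, (value, source) in sorted(values.items()):
        print(f"  {key} = {value!r} (from {source})")
```

The database server picks up its plugins from the role layer and everything else from common.yaml, whereas stats01 only deviates from the defaults in the one value that its node-specific file overrides.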

Don't worry if all this feels a bit overwhelming. There are more examples in this chapter. Hiera also has the charming characteristic of seeming rather complicated on paper, but it feels very natural and intuitive once you try using it yourself.

Choosing your backends

There are two built-in backends: YAML and JSON. This chapter will focus on YAML, because it's a very convenient and efficient form of data notation. The JSON backend works very similarly. For each data source, it looks for data in a .json file instead of a .yaml file, and these files use JSON notation.
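
For comparison, here is a sketch of what the values from common.yaml might look like in JSON notation, parsed with Python's standard json module. The idea that this content would live in a common.json file is an assumption based on the description above.

```python
import json

# The same data as common.yaml, written as a JSON object.
COMMON_JSON = """
{
  "reporting::server": "stats01.example.net",
  "reporting::server_port": 9033
}
"""

common = json.loads(COMMON_JSON)
for key, value in common.items():
    print(f"{key}: {value!r}")
```

The structure is the same hash of keys and values. Only the notation changes, with mandatory quotes around the keys and braces around the hash.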

The use of multiple backends should never be truly necessary. In most cases, a well-thought-out hierarchy will suffice for your needs. With a second backend, data lookup will traverse your hierarchy once per backend. This means that the lowest level of your primary backend will rank higher than any layer from additional backends.
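
The effect of this traversal order is easy to miss, so here is a Python sketch of it. The outer loop runs over the backends and the inner loop over the hierarchy. This only models the behavior that was just described and is not Hiera's actual implementation. The node name, the port 9999, and the reporting::fallback key are hypothetical.

```python
def lookup(key, backends, hierarchy):
    """Search the whole hierarchy in one backend before trying the next."""
    for backend_name, backend_data in backends:   # in configured order
        for source in hierarchy:                  # most to least specific
            data = backend_data.get(source, {})
            if key in data:
                return data[key], backend_name, source
    raise KeyError(key)


hierarchy = ["node/db01.example.net", "common"]

# Each backend maps data source names to their contents.
backends = [
    ("yaml", {
        "common": {"reporting::server_port": 9033},
    }),
    ("json", {
        "node/db01.example.net": {"reporting::server_port": 9999},
        "common": {"reporting::fallback": True},
    }),
]

print(lookup("reporting::server_port", backends, hierarchy))
print(lookup("reporting::fallback", backends, hierarchy))
```

Even though the second backend holds a node-specific port, the common layer of the first backend is found earlier and wins. The second backend only contributes values for keys that the first backend does not define at all.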

In some cases, it might be worthwhile to add another backend just to get the ability to define even more basic defaults in an alternative location—perhaps a distributed filesystem or a source control repository with different commit privileges.

Also, note that you can add custom backends to Hiera, so these might also be sensible choices for secondary or even tertiary backends. A Hiera backend is written in Ruby, like the Puppet plugins. The details of creating such a backend are beyond the scope of this book.

Note

A particularly popular backend plugin is eyaml, available through the hiera-eyaml Ruby gem. This backend allows you to incorporate encrypted strings in your YAML data. Puppet decrypts the data upon retrieval.

You have studied the theory of storing data in Hiera at length, so it's finally time to see how to make use of this in Puppet.
