Sample architectures

We have outlined the main tasks and components we can use to put together a Puppet architecture, and we have looked at Foreman, Hiera, and the roles and profiles pattern. Now let's see some real examples based on them.

The default approach

By default, Puppet doesn't use an ENC and lets us classify nodes directly in /etc/puppet/manifests/site.pp (or in files imported from there) with the node statement. So a very basic setup would have site.pp with content like the following:

node www01 {
  # Place here resources to apply to this node in Puppet DSL:
  # file, package, service, mount...
}
node lb01 {
  # Resources for this node: file, package, service...
}

This is all we need: no modules with their classes, no Hiera, no ENC; just good old plain Puppet code as they teach us in schools, so to speak.

This basic approach, useful just for the first tests, obviously does not scale well and would quickly become a huge mess of duplicated code.

The next step is to use classes which group resources, and if these classes are placed inside modules, we can include them directly without the need to explicitly import the containing files:

node www01 {
  include ::apache
  include ::php
}

This approach, although definitely cleaner, will also quickly fill up with redundant code, so we will probably want to introduce grouping classes that group other classes and resources according to the desired logic.

One common example is a class that includes all the modules, classes and resources we want to apply to all our nodes: a general class.
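
As a rough sketch, such a general class could look like the following. The included class names (ntp, openssh) and the motd file are hypothetical placeholders, and the corresponding modules are assumed to be available in the modulepath:

```puppet
# Baseline applied to every node (illustrative; names are assumptions)
class general {
  # Classes from public or local modules that every node needs
  include ::ntp
  include ::openssh

  # Plain resources can live here too
  file { '/etc/motd':
    ensure  => file,
    owner   => 'root',
    group   => 'root',
    mode    => '0644',
    content => "This node is managed by Puppet\n",
  }
}
```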

Another example is role classes, which include all the extra resources needed by a particular node:

node www01 {
  include ::general
  include ::role::www
}

We can then have other grouping classes to better organize and reuse our resources, such as the profiles we have just discussed.

Note

Note that with the above names we would need two different local (site) modules: general and role. I personally prefer to place all the local, custom resources in a single module, called site or, even better, named after the project, customer, or company. Given this, the previous example could be as follows:

node www01 {
  include ::site
  include ::site::role::www
}

These are only naming matters, which have consequences on the directory layout and possibly on permissions management in our SCM, but the principle of grouping resources according to custom logic is the same.

Up to now, we have just included classes. Often, though, the same class is included by nodes that need different effects from it: slightly different configuration files, specific resources, or any of the variations we meet in the real world when configuring the same application on different systems.

Here is where we need to use variables and parameters to alter the behavior of a class according to custom needs.

And here is where the complexity begins, because there are various elements to consider, such as the following:

  • Which variables identify our node
  • If they are sufficient to manage all the variations in our nodes
  • Where we want to place our logic that copes with them
  • Where configurations should be provided as plain static files, where it is better to use templates, or where we could just modify single lines inside files
  • How these choices may affect the risk of making a change that affects unexpected nodes
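
To make the fourth point more concrete, here is a minimal sketch of the three ways of managing file content. The class name, file paths, and the hosts entry are assumptions made for illustration; file_line is provided by the puppetlabs-stdlib module, which is assumed to be installed:

```puppet
# Illustrative only: class name, paths, and values are assumptions
class site::files_example {
  # 1. Plain static file, identical on every node that includes this class
  #    (served from site/files/issue)
  file { '/etc/issue':
    ensure => file,
    source => 'puppet:///modules/site/issue',
  }

  # 2. Template: the whole file is managed, but its content can vary
  #    with variables (rendered from site/templates/ntp.conf.erb)
  file { '/etc/ntp.conf':
    ensure  => file,
    content => template('site/ntp.conf.erb'),
  }

  # 3. Single line: only one entry is enforced, the rest of the file
  #    is left untouched
  file_line { 'hosts_puppet_master':
    path => '/etc/hosts',
    line => '10.0.0.10 puppet.example.com puppet',
  }
}
```

The wider the managed scope (whole file versus single line), the more a change can affect nodes we did not have in mind, which is the risk the last point refers to.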

Note

The most frequent and dangerous mistakes with Puppet are due to people making changes in code (or data) that are supposed to be made for a specific node but also affect other nodes. Most of the time this happens when people don't know the structure and logic of the Puppet codebase they are working on well enough. There are no easy rules to prevent such problems, just some general suggestions, such as the following:

  • Promote code peer review and communication among the Puppeteers
  • Test code changes on canary nodes
  • Use naming conventions and coherent code organization to maximize the principle of least surprise
  • Embrace code simplicity, readability and documentation
  • Be wary of the scope and extent of our abstraction layers

We also need classes that actually allow things to be changed, via parameters or variables, if we want to avoid placing our logic directly inside them.

Patterns on how to manage variables and their effect on the applied resources have changed a lot with the evolution of Puppet and the introduction of new tools and functionalities.

We won't dwell on how things were done in the good old days. In a modern, currently recommended Puppet setup, we expect to have the following:

  • At least Puppet 3 on the Puppet Master, so that we can use data bindings
  • Classes that expose parameters that allow us to manage them
  • Reusable public modules that allow us to adapt them to our use case without modifications

In this case, we can basically follow two different approaches:

First, we can keep on including classes and set in Hiera the values for the parameters we need to modify. So, in our example, we could have in site/manifests/role/www.pp something like the following:

class site::role::www {
  include ::apache
  include ::php
}

Then, in Hiera, in a file like hieradata/env/devel.yaml, we set parameters like the following:

---
apache::port: 8080
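
For that Hiera key to have any effect, the apache class must declare a port parameter. With data bindings (Puppet 3 and later), a plain include ::apache then looks up apache::port automatically. A minimal, hypothetical sketch of such a class follows; it is not the code of any published apache module, and the package name and file path are assumptions:

```puppet
# Illustrative stand-in for a parameterized apache class
class apache (
  $port = '80',   # used only when Hiera has no apache::port key
) {
  package { 'httpd':
    ensure => installed,
  }

  # A real module would template the whole configuration; this single
  # file just shows where the parameter ends up
  file { '/etc/httpd/conf.d/listen.conf':
    ensure  => file,
    content => "Listen ${port}\n",
    require => Package['httpd'],
  }
}
```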

Alternatively, we might use explicit class declarations such as:

class site::role::www {
  $apache_port = $::env ? {
    devel   => '8080',
    default => '80',
  }
  class { '::apache':
    port => $apache_port,
  }
  include ::php
}

With this second approach, data, and the logic that determines it, is definitely inside code.

Basic ENC, logic on site module, Hiera backend

The ENC and Hiera can be alternative or complementary; this approach takes advantage of both and uses the site module for most of the class inclusion logic, the configuration files, and the custom resources.

All the class parameters are placed in Hiera.

In the ENC, when this is not possible via facts, we set the variables that identify our nodes and that can be used in our Hiera hierarchy.

In site.pp, or in the same ENC, we include just a single site class, and there we manage our grouping logic. For example, with a general baseline and role classes:

class site {
  include ::site::general
  if $::role {
    include "::site::roles::${::role}"
  }
}

In our role classes, which are included if the $role variable is set on the ENC, we can manage all the role-specific resources, possibly dealing with differences according to environment or other identifying variables directly in the role class, or using profiles.

Note

Note that in this chapter we've always referred to classes by their full name, so a class such as mysql is referred to as ::mysql. This is useful for avoiding name collisions when, for example, role names clash with existing modules. If we don't use the leading :: characters, we will have problems, for example, with a class called site::role::mysql, which may be confused with the main class mysql.
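
A short sketch of this collision, assuming Puppet 3 relative name lookup and a mysql module available in the modulepath (the class body is illustrative):

```puppet
class site::role::mysql {
  # Without the leading ::, Puppet 3 looks the name up relative to the
  # current namespace first, so this would find site::role::mysql itself
  # instead of the mysql module's main class:
  #   include mysql

  # The leading :: anchors the lookup at top scope
  include ::mysql
}
```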

The Foreman and Hiera

The Foreman can act as a Puppet ENC; it's probably the most common ENC around and we can use both Foreman and Hiera in our architecture.

In this case we should strictly separate responsibilities and scopes, even if they might be overlapping. Let's review our components and how they might fit in a scenario based on the Foreman, Hiera, and the usual site module(s):

  • Classes to include in nodes: This can be done on the Foreman, the site module, or both. It mostly depends on how much logic we want on the Foreman, and so how many activities have to be done via its interface and how many are moved into site module(s). We can decide to define roles and profiles on the site module and use Foreman just to define top scope variables and the inclusion of a single basic class, as in the previous example. Alternatively, we may prefer to use Foreman's HostGroups to classify and group nodes, moving most of the class grouping logic into Foreman.
  • Variables to assign to nodes: This can be done on Foreman and Hiera. It probably makes sense to set on Foreman only the variables we need to identify nodes (if they are not provided by facts) and generally the ones we might need to use in the Hiera hierarchy. All the other variables and the logic on how to derive them should stay on Hiera.
  • Files: These should stay on our site module or, possibly, on Hiera (with the hiera-file plugin).

Hiera-based setup

A common scenario involves the usage of Hiera to manage both the classes to include in nodes and their parameters.

No ENC is used; site.pp just needs the following, possibly together with a few handy resource defaults:

hiera_include('classes')
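
As an illustration, a complete site.pp of this kind, resource defaults included, might look like the following sketch. The specific defaults shown are assumptions; which ones are handy depends on the environment:

```puppet
# site.pp (sketch; the defaults below are examples, not recommendations)

# Default search path for all exec resources
Exec {
  path => '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin',
}

# Default ownership for all file resources
File {
  owner => 'root',
  group => 'root',
}

# Include every class listed under the 'classes' key across the hierarchy
hiera_include('classes')
```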

Classes and parameters can be assigned to nodes with all the flexibility of our hierarchy, so in common.yaml we can have the following:

---
# Common classes on all nodes
classes:
  - puppet
  - openssh
  - timezone
  - resolver
# Common Class Settings
timezone::timezone: 'Europe/Rome'
resolver::dns_servers:
  - 8.8.8.8
  - 8.8.4.4

In a specific data source file such as role/web.yaml, we can add the classes and the parameters we want to apply to that group of nodes:

---
classes:
  - stack_lamp
stack_lamp::apache_include: true
stack_lamp::php_include: true
stack_lamp::mysql_include: false

The modules used (here a sample stack_lamp, but it could be something such as profile::webserver or apache and php) should expose parameters that are needed to configure things as expected.

They should also allow creation of custom resources, such as apache::vhost, by providing hashes to feed a create_resources() function present in a used class.
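
A minimal sketch of this pattern follows. To stay self-contained it uses a hypothetical site::vhost define and site::vhosts class rather than a published module's apache::vhost; all names, parameters, and paths are assumptions, and each definition would live in its own file inside the site module:

```puppet
# Hypothetical define: one virtual host configuration file per resource
define site::vhost (
  $port    = '80',
  $docroot = '/var/www/html',
) {
  file { "/etc/httpd/conf.d/${title}.conf":
    ensure  => file,
    content => "<VirtualHost *:${port}>\n  ServerName ${title}\n  DocumentRoot ${docroot}\n</VirtualHost>\n",
  }
}

# Hypothetical class that turns a hash of data into resources
class site::vhosts (
  # Looked up in Hiera as site::vhosts::vhosts via data bindings, e.g.:
  #   site::vhosts::vhosts:
  #     www.example.com:
  #       port: '8080'
  #       docroot: /var/www/example
  $vhosts = {},
) {
  # Declares one site::vhost resource per key of the hash
  create_resources('site::vhost', $vhosts)
}
```

With this in place, adding a virtual host to a group of nodes is a matter of editing Hiera data only, with no change to Puppet code.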

Configuration files and templates can be placed in a site module, possibly with additional custom classes.

We can also use the hiera-file plugin to deliver configuration files and so obtain a Hiera-only setup. This is a somewhat extreme approach: everything is managed by Hiera, including the classes to include in nodes, their parameters, and the configuration files to serve to clients. Here too, we need modules and relevant classes that expose parameters to manage the content of files.

Secrets, credentials, and sensitive data may be encrypted via hiera-eyaml or hiera-gpg.

We may wonder if a site module is still needed, since most of its common functions (providing custom files, managing logic, defining and managing variables) can be moved to Hiera.

The answer is probably yes; even in such a strongly Hiera-oriented scenario, a site module is probably needed. We might, for example, use custom classes to manage edge cases or exceptions that could be difficult to replicate with Hiera without adding a specific entry in the hierarchy.

One important point to consider when we move most of our logic to Hiera is how much this costs in terms of hierarchy size. Sometimes a simple (even if not elegant) custom class that deals with a particular case may save us from adding a layer in the hierarchy.
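
As a sketch of such a trade-off, the following hypothetical class handles a single exceptional host in code. The class name, host name, and package are made up for illustration:

```puppet
# Hypothetical exception class in the site module
class site::exceptions {
  # One legacy host needs an extra package; a conditional here avoids
  # adding a per-host level to the Hiera hierarchy just for this case
  if $::hostname == 'legacy01' {
    package { 'compat-libstdc++-33':
      ensure => installed,
    }
  }
}
```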

Foreman smart variables

This is Foreman's alternative to Hiera for the full management of the variables used by nodes.

Foreman can automatically detect the parameters exposed by classes and allows us to set values for them according to custom logic, providing them as parameters for parameterized classes via the ENC functionality (support for parameterized classes via ENC has been available since Puppet 2.6.5).

To each class we can map one or more smart variables, which may have different values according to different, customizable conditions and hierarchies.

The logic is somewhat similar to Hiera, with the notable difference that we can have a different hierarchy for each variable and have other ways to define its content via matchers.

User experience benefits from the web interface, which may turn out to be easier than editing Hiera files directly. Foreman's auditing features allow us to track changes, as an SCM would on plain files.

On the other hand, we don't have the multiple backend flexibility that we have with Hiera, and we'll be completely tied to Foreman for the management of our nodes.

Personally, I have no idea how many people are extensively using smart variables in their setups; just be aware that this alternative for data management exists.

Facts-driven truths

A facts-driven approach was theorized by Jordan Sissel, Logstash's author, in a 2010 blog post (http://www.semicomplete.com/blog/geekery/puppet-nodeless-configuration). The idea is that the most authoritative information we can have about a node comes from its own facts.

We may decide to use facts in various places: in our hierarchy, in our site code, and in templates. If our facts let us identify the node's role, environment, zone, or any other identifying variable, we might not even need node classification and could manage everything in our site module or in Hiera.

It is now very easy to add custom facts by placing a file in the node's /etc/facter/facts.d directory. This can be done, for instance, by a (cloud) provisioning script.

Alternatively, if our nodes' names are standardized and informative, we can easily define our identifying variables in facts that might be provided by our site module.

If all the variables that identify our node come from facts, we can have in our site.pp a single line as simple as the following:

include site

In our site/manifests/init.pp, we can have something like the following:

class site {
  if $::role {
    include "site::roles::role_${::role}"
  }
}

The top scope $::role variable would be, obviously, a fact.

Logic for data and classes to include can be managed where we prefer: on site modules, Hiera, or the ENC.

The principle here is that as much data as possible, and especially the nodes' identifying variables, should come from facts.

Common sense applies here too, and extreme usage should be avoided; in particular, a custom Ruby fact should compute its output without any local data. If we start to place data inside the fact in order to return it, we are probably doing something wrong.

Nodeless site.pp

We have seen that site.pp does not necessarily need node definitions, either in its own content or in imported files. We don't need them when we drive everything via facts, when we manage class inclusion in Hiera, or when conditionals based on host names are used to set the top scope variables that identify our nodes:

# nodeless site.pp

# Roles are based on hostnames
case $::hostname {
  /^web/: { $role = 'web' }
  /^puppet$/: { $role = 'puppet' }
  /^lb/: { $role = 'lb' }
  /^log/: { $role = 'log' }
  /^db/: { $role = 'db' }
  /^el/: { $role = 'el' }
  /^monitor/: { $role = 'monitor' }
  default: { $role = 'default' }
}

# Env is based on hostname or (sub) domain
if 'devel' in $::fqdn { $env = 'devel' }
elsif 'test' in $::fqdn { $env = 'test' }
elsif 'qa' in $::fqdn { $env = 'qa' }
else { $env = 'prod' }

include site
# hiera_include('classes')

Here, the $role and $env variables are set at top scope according to hostnames that would benefit from a naming standard we can parse with Puppet code.

At the end, we just include our site class or use hiera_include to manage the grouping logic for what classes to include in our nodes.

Such an approach makes sense only where we don't have to manage many different hostnames or roles, and where the names of our nodes follow a naming pattern that lets us derive identifying variables.

Note

Note that the $::hostname or $clientcert variables might be forged and may return untrustworthy values. Since Puppet 3.4, if we set trusted_node_data = true in puppet.conf, we have at our disposal the special variable $trusted['certname'], which identifies a verified hostname.
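
A nodeless site.pp could therefore derive its roles from the verified certificate name instead. This is a minimal sketch, assuming trusted_node_data = true is set on the Puppet Master and that certificate names follow the same naming pattern as the host names:

```puppet
# $trusted['certname'] holds the name from the node's verified
# certificate, typically its FQDN
case $trusted['certname'] {
  /^web/:  { $role = 'web' }
  /^db/:   { $role = 'db' }
  default: { $role = 'default' }
}

include site
```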
