Chapter 7. Controlling Puppet

If you are using Puppet, you will be quite happy with the level of control MCollective gives you. MCollective enables ways of using Puppet that simply aren’t possible with daemonized, cron-run, or even command-line invocations of Puppet.

Install the Puppet Agent

The first thing we need to do is install the MCollective Puppet agent. Installation is identical to that of the agents we installed in Chapter 4. Since we know you have Puppet installed, we’ll dispense with the command-line installation and show you how to do it with Puppet.

node nodename {
  mcollective::plugin::agent  { 'puppet': }  # for servers
  mcollective::plugin::client { 'puppet': }  # for clients
}

If you use Hiera, you can install the agent by simply listing the puppet agent under the mcollective::plugin::agents key. In this example, we also set dependencies on the puppet agent to ensure that the Puppet client is installed on the host.

mcollective::plugin::agents:
  puppet:
    version: latest
    dependencies:
      - Package[%{hiera('puppet::client::package_name')}]
      - Service[%{hiera('puppet::client::service_name')}]

mcollective::plugin::clients:
  puppet:
    version: latest

Note

This is obviously a bit redundant: since Puppet is enforcing this policy, we already know that Puppet is installed. But it makes for a good example, because the MCollective puppet agent can’t function without Puppet installed.

Checking Puppet Status

Once you have installed the agent and restarted mcollectived (which the puppet module does for you), you can query and run Puppet from any client host with the puppet client plugin installed. The first thing you should do is confirm which systems have the MCollective Puppet agent installed.

$ mco puppet count
Total Puppet nodes: 3

          Nodes currently enabled: 3
         Nodes currently disabled: 0

Nodes currently doing puppet runs: 0
          Nodes currently stopped: 3

       Nodes with daemons started: 1
    Nodes without daemons started: 2
       Daemons started but idling: 1

$ mco puppet summary
Summary statistics for 3 nodes:

                Total resources: ▄▁▁▁▁▁▁▁▁▇▁▁▁▁▁▁▁▁▁▁  min: 0.0    max: 17.0
          Out Of Sync resources: ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁  min: 0.0    max: 0.0
               Failed resources: ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁  min: 0.0    max: 0.0
              Changed resources: ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁  min: 0.0    max: 0.0
Config Retrieval time (seconds): ▇▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁  min: 0.0    max: 1.8
       Total run-time (seconds): ▇▇▇▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁  min: 0.0    max: 2.3
  Time since last run (seconds): ▇▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▇▇  min: 221.0  max: 2.5k

You’ll notice that these Puppet runs are very fast, with fairly few resources involved; only a few resources are needed for the minimal test environment built by the mcollective module provided in this book. A production setup will usually have much longer run times, with thousands or tens of thousands of resources involved.

Controlling the Puppet Daemon

During maintenance you may want to disable the puppet agent on certain nodes. When you disable the agent, you can supply a message letting others know what you are doing.

$ mco puppet disable --with-identity heliotrope message="Disk replacement"

 * [ ==================================================> ] 1 / 1

Summary of Enabled:
   disabled = 1

Finished processing 1 / 1 hosts in 85.28 ms
$ mco puppet runonce --with-identity heliotrope

 * [ =================================================> ] 1 / 1

heliotrope                               Request Aborted
   Puppet is disabled: 'Disk replacement'
   Summary: Puppet is disabled: 'Disk replacement'

Finished processing 1 / 1 hosts in 84.22 ms

Re-enabling the puppet agent on the node is just as easy as disabling it.

$ mco puppet enable --with-identity heliotrope

 * [ ==================================================> ] 1 / 1

Summary of Enabled:
   enabled = 1

Finished processing 1 / 1 hosts in 84.36 ms

You can easily apply these same commands to enable or disable the puppet agent on nodes matching any filter criteria as discussed in Filters.
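For instance, combining these commands with a fact or class filter might look like the following (the filter values and message here are illustrative, not taken from the examples above):

```
$ mco puppet disable --with-fact operatingsystem=CentOS message="Kernel patching"
$ mco puppet enable --with-class webserver
```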

Invoking Ad-Hoc Puppet Runs

The MCollective puppet agent provides a powerful tool for controlling Puppet runs. If you examine the help output for puppet, you’ll find many familiar options for controlling Puppet runs from the command line with puppet agent or puppet apply:

$ mco help puppet

Application Options
  --force                    Bypass splay options when running
  --server SERVER            Connect to a specific server or port
  --tags, --tag TAG          Restrict the run to specific tags
  --noop                     Do a noop run
  --no-noop                  Do a run with noop disabled
  --environment ENVIRONMENT  Place the node in a specific environment for this run
  --splay                    Splay the run by up to splaylimit seconds
  --no-splay                 Do a run with splay disabled
  --splaylimit SECONDS       Maximum splay time for this run if splay is set
  --ignoreschedules          Disable schedule processing
  --rerun SECONDS            When performing runall do so repeatedly
                                 with a minimum run time of SECONDS

The simplest invocation is naturally to run puppet immediately on one system.

$ mco puppet runonce --with-identity heliotrope

 * [ ==================================================> ] 1 / 1

Finished processing 1 / 1 hosts in 193.99 ms

$ mco puppet status --with-identity heliotrope

 * [ ==================================================> ] 1 / 1

   heliotrope: Currently idling; last completed run 02 seconds ago

Summary of Applying:
   false = 1

Summary of Daemon Running:
   running = 1

Summary of Enabled:
   enabled = 1

Summary of Idling:
   true = 1

Summary of Status:
   idling = 1

Finished processing 1 / 1 hosts in 86.43 ms

What if you needed to run Puppet instantly on every CentOS host to fix the sudoers file? Notice in the output here that one of these hosts had the puppet daemon running and the other did not. However, both ran Puppet when we asked them to.

$ mco puppet runonce --tags=sudo --with-fact operatingsystem=CentOS

 * [ ==================================================> ] 2 / 2

Finished processing 2 / 2 hosts in 988.26 ms

$ mco puppet status --wf operatingsystem=CentOS

 * [ ==================================================> ] 2 / 2

     geode: Currently stopped; last completed run 1 minutes 52 seconds ago
heliotrope: Currently idling; last completed run 2 minutes 21 seconds ago

Summary of Applying:
   false = 2

Summary of Daemon Running:
   stopped = 1
   running = 1

Summary of Enabled:
   enabled = 2

Summary of Idling:
    true = 1
   false = 1

Summary of Status:
   stopped = 1
    idling = 1

Finished processing 2 / 2 hosts in 42.17 ms

How about prompting puppet to update immediately on every host in your environment? If you are using only local manifests, you can trigger a run affecting thousands of hosts. In most server-based environments the puppet servers won’t be able to handle every client checking in for a fresh catalog all at the same time. Likewise, you may want to slow roll a large number of hosts to prevent too many of them being out of service for a major change.

Run puppet on all servers, just one at a time:

$ mco puppet runall 1
2014-02-10 23:14:00: Running all nodes with a concurrency of 1
2014-02-10 23:14:00: Discovering enabled Puppet nodes to manage
2014-02-10 23:14:03: Found 3 enabled nodes
2014-02-10 23:14:06: geode schedule status: Signalled the running Puppet Daemon
2014-02-10 23:14:06: 2 out of 3 hosts left to run in this iteration
2014-02-10 23:14:09: Currently 1 node applying the catalog; waiting for less than 1
2014-02-10 23:14:13: Currently 1 node applying the catalog; waiting for less than 1
2014-02-10 23:14:17: heliotrope schedule status: Signalled the running Puppet Daemon
2014-02-10 23:14:18: 1 out of 3 hosts left to run in this iteration
...etc

Run puppet on all webservers, up to five at a time:

$ mco puppet runall 5 --with-identity /^webd/

Note that runall is like batch, except that instead of waiting out a sleep time, it waits for one of the puppet daemons to complete its run before starting another. If you don’t mind some potential overlap, you can always use the batch options instead:

$ mco puppet runonce --batch 10 --batch-sleep 60 --tags ntp

Manipulating Puppet Resource Types

The mcollective puppet agent is so powerful that you can make arbitrary changes based on Puppet’s Resource Abstraction Layer (RAL). For example, if you wanted to ensure the httpd service was stopped on a given host, you could do the following.

$ mco puppet resource service httpd ensure=stopped --with-identity geode

 * [ ==================================================> ] 1 / 1

geode
   Changed: true
    Result: ensure changed 'running' to 'stopped'

Summary of Changed:
   Changed = 1

Finished processing 1 / 1 hosts in 630.99 ms

You can obviously limit this in all the ways specified in Filters. For example, you probably only want to do this on hosts where apache is not being managed by puppet:

$ mco puppet resource service httpd ensure=stopped --wc !apache

You could also fix the root alias on hosts:

$ mco puppet resource mailalias root recipient=[email protected]

This way lies danger.

This section documents some extremely powerful controls. Enabling the Puppet RAL allows direct, instantaneous, and arbitrary access to any Puppet resource type MCollective knows how to affect. Read carefully through the next section to learn how to protect yourself.

Restricting which Resources can be Controlled

By default, no resources can be controlled from MCollective: the feature is enabled in the mcollective agent, but it ships with an empty whitelist. Consider this feature a really powerful shotgun; the whitelist protects you and everyone who depends upon that foot you are aiming at. Be careful.

These are the default configuration options:

plugin.puppet.resource_allow_managed_resources = true
plugin.puppet.resource_type_whitelist = none

If you want to allow resource control, you will need to edit the mcollective/server.cfg file with either a whitelist or a blacklist of resources that can be controlled.

Example 7-1. Whitelist allows only specified resources
plugin.puppet.resource_type_whitelist = package,service
Example 7-2. Blacklist allows everything except specified resources
plugin.puppet.resource_type_blacklist = exec

MCollective does not allow you to mix white and black lists.

Block MCollective from Puppet Resources

By default (as shown above), MCollective is allowed to control resources that are defined in the Puppet catalog, which means it can make a change against the Puppet policy. Sending alternate options for a resource in the Puppet catalog is most likely to simply be overwritten the next time Puppet runs without the same options. In the worst case, well… sorry about the foot.

To prevent MCollective from altering resources under Puppet’s control, disable the following setting.

plugin.puppet.resource_allow_managed_resources = false

Debugging

Here we'll go through some common errors you might encounter with MCollective and Puppet interaction.

Unable to Match Server with Class

If you are unable to match a host using the --with-class filter option, the first thing to do is get an inventory of the node with mco inventory hostname. If you find that the inventory does not list any classes for a host, then the classes.txt file that mcollectived is trying to read is not being written to by Puppet.

The classes.txt file is written out by the Puppet agent during each run. In the [agent] section of puppet.conf there is a variable classfile, which defaults to $statedir/classes.txt, where $statedir defaults to $vardir/state. MCollective defaults to the same location as Puppet on every platform.

However this variable can be overridden in both puppet.conf and mcollective/server.cfg. If you do not see Puppet classes in the output of an inventory request for a puppetized node, you should check the following two values and ensure that they match up.

$ sudo puppet apply --configprint classfile
/var/lib/puppet/state/classes.txt
$ grep classesfile /etc/mcollective/server.cfg
$ mco rpc rpcutil get_config_item item=classesfile -I heliotrope
heliotrope
   Property: classesfile
      Value: /var/lib/puppet/state/classes.txt

If the classfile value from Puppet matches the value MCollective reports above, then mcollective doesn’t need an override in server.cfg. If the values differ, you may want to set them explicitly to match in both files.

# /etc/puppet/puppet.conf
[agent]
  classfile = $statedir/classes.txt

# /etc/mcollective/server.cfg
classesfile = /var/lib/puppet/state/classes.txt

Unable to Match Server with Fact

If you are unable to match a host using the --with-fact filter option, the first thing to do is get an inventory of the node with mco inventory hostname. If you find that the inventory does not list any facts for a server, then the facts.yaml file that mcollectived is trying to read is not being written by Facter or Puppet.

For MCollective to know about Facts, there needs to be a parameter named plugin.yaml defined in mcollective’s server.cfg. The value of this parameter should be a filename that lists the server’s facts in YAML format, usually /etc/mcollective/facts.yaml.

# /etc/mcollective/server.cfg
factsource = yaml
plugin.yaml = /etc/mcollective/facts.yaml

The value of the plugin.yaml parameter can list multiple filenames, separated by a colon on Unix systems or a semicolon on Windows servers. If the facts do not show up after restarting mcollectived, the most likely problem is the formatting of the YAML within the file.
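For example, a Unix server.cfg that merges facts from two files could look like this (the second path is hypothetical, shown only to illustrate the colon separator):

```
# /etc/mcollective/server.cfg
factsource = yaml
plugin.yaml = /etc/mcollective/facts.yaml:/etc/mcollective/local-facts.yaml
```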

The most basic way to collect system facts was described in Facts. A more elegant and flexible solution which can use Puppet-generated facts or values was documented in Sharing Facts with Puppet. It doesn’t matter how you generate your system facts, as long as they are written in YAML format to the listed file.

Confirm that one of the following is configured to write out facts to this file.

  1. A cron job which generates YAML as described in Facts.
  2. A puppet module which writes out Facter facts and other variables as described in Sharing Facts with Puppet.
  3. Some other script or process you have which can generate YAML key/value pairs.
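As a minimal sketch of the third option, any script that emits valid YAML key/value pairs will do. The path and fact names below are purely illustrative; a real deployment would write to /etc/mcollective/facts.yaml.

```shell
# Write two static facts in the simple YAML key/value form mcollectived expects.
# FACTS_FILE is a stand-in for /etc/mcollective/facts.yaml.
FACTS_FILE="${FACTS_FILE:-/tmp/facts.yaml}"
{
  echo "operatingsystem: CentOS"
  echo "testlab: 'true'"
} > "$FACTS_FILE"
```

Anything that keeps this file populated with current values will satisfy the YAML facts plugin.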

You can install Facts plugins other than YAML from the Puppet Labs Forge, GitHub, or other repositories as discussed in Finding Community Plugins. You can also build your own as documented in Creating Other Types of Plugins.

Warning

There is a plugin named mcollective-facter-facts on the Puppet Labs GitHub. This plugin can be slow, as it invokes Facter for each evaluation. The YAML plugin used above, which loads facts from a YAML-format text file, works much better.

Unable to Match Server by Hostname

If you are unable to match a host using the --with-identity or -I filter option, the first thing to do is confirm that mcollectived is running on the server. This is the most likely reason for a failed response by name.

The next step is to check and see what the configured identity in the server configuration might be:

$ grep identity /etc/mcollective/server.cfg
#identity=

In this situation, the identity is not hardcoded in the server configuration, so we’ll have to look elsewhere.

The default identity for the node is the output of the hostname command. If you are using Puppet, we can query Puppet for its certname, which we can use as a fact to query the node identity.

# server
$ sudo puppet apply --configprint certname
heliotrope.example.net

# client
$ mco rpc rpcutil get_config_item item=identity --wf clientcert=heliotrope.example.net

heliotrope
   Property: identity
      Value: heliotrope

Warning

No, that’s not a misprint. The configuration variable certname is provided by Puppet as the Facter fact clientcert. No idea why the inconsistency; it’s just how Puppet is.

Likewise, you can use any other fact or class as previously described to locate the node. For example, there are only two CentOS hosts in my testlab.

$ mco rpc rpcutil get_config_item item=identity --wf operatingsystem=CentOS

Discovering hosts using the mc method for 2 second(s) .... 2
 * [ =================================================> ] 2 / 2

geode
   Property: identity
      Value: geode

heliotrope
   Property: identity
      Value: heliotrope

Summary of Value:
        geode = 1
   heliotrope = 1

Finished processing 2 / 2 hosts in 18.38 ms

If you want an MCollective node to think of itself with a different name, then set identity in server.cfg:

server.cfg

identity = iambatman

If you are using configuration management like any sane person, you can have the variable set from the configuration management’s knowledge of itself. For example, here’s a Puppet template fragment to ensure the MCollective node knows itself by the Puppet certificate name, rather than the output of hostname:

server.cfg.erb

identity = <%= scope.lookupvar('::clientcert') -%>

The most common source of node name confusion is based around the use of node names or FQDNs in the hostname of a system. For example, you can set a node’s hostname to either a simple name or you can include the domain.

$ grep HOSTNAME /etc/sysconfig/network   # RedHat location
HOSTNAME=heliotrope
$ hostname
heliotrope
$ hostname -f
heliotrope.example.net

With this setup, the MCollective identity was heliotrope while the Puppet certname was heliotrope.example.net. You can resolve that mismatch by changing /etc/sysconfig/network on RedHat-derived systems, /etc/hostname on Debian-derived systems, or /etc/rc.conf on *BSD systems. Or you can leave it alone, so long as you understand the difference.

The mismatch between Puppet and MCollective names does not create any explicit problems. My test setup uses short node names (e.g. “heliotrope”) for MCollective, while Puppet always uses the FQDN of the host.

Absolutely nothing breaks out of the box when Puppet and MCollective use different identities; it only affects how you might write your custom plugins. In the author’s personal opinion, if your hosts have entirely unique hostnames across your DNS domains, you can save a lot of typing by leaving the domain name off of the hostname. Other people have different opinions drawn from their experiences. Your Mileage May Vary (YMMV).

Some servers ignore messages

Many of the weirder problems observed on the mailing list turn out to be due to the clients and servers having a different idea of what time it is. Before you take any other debugging steps, ensure that your systems have a consistent view of the time.

client$ date +%s
1396310400
server$ date +%s
1396310402

Allowing for the time it takes you to run the commands on the two systems, the results should be within a few seconds of each other. With modern NTP time sync, 1/100th of a second is a considerable gap, so most systems should easily be within the same second.

The reason this is important is due to how messages are constructed. Every MCollective message sent out contains the current timestamp, and a ttl to indicate how long the message is valid.

{
  :msgtime     => 1396310400,
  :ttl         => 60,
  :filter      => {
                    "fact"  => [{:fact => "operatingsystem", :value => "Debian"}],
                    "agent" => ["package"]
                  },
  :senderid    => "desktop.example.net",
  :msgtarget   => "/topic/mcollective.discovery.command",
  :agent       => "discovery",
  :collective  => "mcollective",
  :body        => body,
  :hash        => "88dd360f13614b7db83616ba49deb130",
  :requestid   => "70141ca8a465954706a51ef6a7d4914e"
}

In the situation described by this packet, the request is valid from 1396310400 until 1396310460. If your server receives a request from too far in the past, the request will be ignored because the TTL has already expired. Even weirder problems can occur with clients that are in the future from the server’s perspective. It is absolutely essential that all systems in the collective have a consistent view of the time.
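The validity check can be sketched like this (a simplified illustration of the logic, not MCollective’s actual implementation; the values are taken from the packet above):

```shell
# A request is honored only while "now" falls inside [msgtime, msgtime + ttl].
msgtime=1396310400
ttl=60
now=1396310430            # pretend this is the receiving server's clock

if [ "$now" -ge "$msgtime" ] && [ "$now" -le $((msgtime + ttl)) ]; then
  verdict="valid"         # within the TTL window
else
  verdict="expired"       # clock skew or TTL elapsed; request is ignored
fi
echo "$verdict"
```

A server whose clock read 1396310500 would compute expired and silently drop the same request, which is exactly the "some servers ignore messages" symptom.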

Note

We aren’t talking about time zones here. Computers track time in UTC and display it in the timezone offset you have configured in your preferences; all time is stored and compared in UTC, as represented above. The commands above show the UTC epoch time: the number of seconds since January 1st, 1970 UTC.

If you know how to translate that number back to Pacific Standard Time, then you’ll know the exact minute I wrote this particular chapter.
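If you’d rather let the computer do that translation, GNU date can convert an epoch value (such as the msgtime from the packet above) back to human-readable UTC:

```shell
# GNU date syntax; on BSD systems use `date -u -r 1396310400` instead.
date -u -d @1396310400
```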
