Chapter 5. Maintenance

Now that you have MCollective working and have an idea of how powerful it can be, let’s go over some of the steps involved in maintaining and debugging it.

Keeping Sessions Alive

If you have a firewall or flow-tracking switch (e.g. Juniper) between your servers and your middleware, you may need to tweak the settings to ensure the connections remain open.

MCollective’s STOMP sessions are idle unless a client is actively issuing requests. MCollective does set the keep-alive flag on the TCP session, but many operating systems send the first keep-alive packet long after most firewalls have dropped the session from their active table. The server will not be aware that the session has been cut, and the middleware will not find out until it tries to forward a message from a client.

To keep the sessions alive, configure the server to send updated registration information at an interval shorter than the firewall’s session timeout. In most situations, every 10 minutes is more than sufficient.

# server.cfg (the interval is in seconds)
registerinterval = 600
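
To sanity-check an interval against your own equipment, here is a minimal sketch. The function names are mine, and the firewall timeout is a figure you have to look up on your own firewall; nothing here is part of MCollective. On Linux the idle time before the first keep-alive probe is commonly 7200 seconds, which is why the TCP keep-alive flag alone rarely helps.

```shell
#!/bin/sh
# Hypothetical helpers, not part of MCollective.

# interval_is_safe INTERVAL FIREWALL_TIMEOUT
# Succeeds when the registration interval (seconds) is strictly
# shorter than the firewall idle timeout (seconds).
interval_is_safe() {
    [ "$1" -lt "$2" ]
}

# Print the idle time before the kernel sends its first keep-alive
# probe. Linux only; prints "unknown" elsewhere.
kernel_keepalive_time() {
    if [ -r /proc/sys/net/ipv4/tcp_keepalive_time ]; then
        cat /proc/sys/net/ipv4/tcp_keepalive_time
    else
        echo unknown
    fi
}

# report_interval INTERVAL FIREWALL_TIMEOUT
report_interval() {
    if interval_is_safe "$1" "$2"; then
        echo "ok: registering every ${1}s beats a ${2}s idle timeout"
    else
        echo "too slow: ${1}s is not shorter than the ${2}s idle timeout"
    fi
}
```

For example, report_interval 600 3600 checks the 10-minute interval above against a firewall assumed to drop idle sessions after one hour.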

The default registration agent is AgentList, which sends only a list of the installed server plugins. You can create your own registration agents to send other information, as we’ll document in Chapter 14.

Activating Changes

After any server or agent configuration change you’ll need to restart mcollectived before the change takes effect.

$ sudo service mcollective restart
Shutting down mcollective:                                 [  OK  ]
Starting mcollective:                                      [  OK  ]

You can send the mcollectived daemon a USR1 signal to make it reload agent plugins. On most platforms you can do this with pkill. The -x option requires an exact match on the process name, so you don’t signal any other program whose name merely contains mcollectived. The following command causes mcollectived to reload its agents, and reports any failure from the pkill command itself.

$ sudo pkill -USR1 -x mcollectived || echo "pkill failed: $?"

Note that this won’t report back any failures from mcollective. For that purpose you’d have to read the log files:

$ tail -20 /var/log/mcollective.log
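
The two steps can be combined so that you see only the log lines written after the reload. This is a rough sketch: the function names are hypothetical, the two-second pause is arbitrary, and it assumes that problem lines in the log contain the words WARN, ERROR, or FATAL.

```shell
#!/bin/sh
# Hypothetical helpers, not part of MCollective.

# log_problems_since LOGFILE START_LINE
# Print lines after START_LINE that contain WARN, ERROR or FATAL
# (assumed level names). Always succeeds, even with no matches.
log_problems_since() {
    tail -n "+$(( $2 + 1 ))" "$1" | grep -E 'WARN|ERROR|FATAL' || true
}

# reload_agents [LOGFILE]
# Signal mcollectived, then show any problems logged afterwards.
# Run as root so that pkill may signal the daemon.
reload_agents() {
    logfile=${1:-/var/log/mcollective.log}
    before=$(wc -l < "$logfile")
    pkill -USR1 -x mcollectived || { echo "pkill failed: $?"; return 1; }
    sleep 2   # arbitrary pause so the daemon can write its log entries
    log_problems_since "$logfile" "$before"
}
```

If mcollectived rotates the log between the two steps the line count no longer lines up, so treat empty output as a hint rather than proof that the reload was clean.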

Server Statistics

In addition to the list of agents available on a server, MCollective also reports back a fair number of statistics from the inventory request.

$ mco inventory heliotrope
Inventory for heliotrope:

   Server Statistics:
                      Version: 2.5.0
                   Start Time: Mon Apr 14 23:27:32 -0700 2014
                  Config File: /etc/mcollective/server.cfg
                  Collectives: mcollective
              Main Collective: mcollective
                   Process ID: 29427
               Total Messages: 5
      Messages Passed Filters: 5
            Messages Filtered: 0
             Expired Messages: 0
                 Replies Sent: 4
         Total Processor Time: 2.66 seconds
                  System Time: 3.65 seconds

   Agents:
      discovery       filemgr         nettest
      package         puppet          rpcutil
      service

...several hundred other lines of output

Because the inventory output is very verbose, I like using awk to print only the Server Statistics block, stopping at the first blank line after it.

$ mco inventory heliotrope | awk '/Server/,/^$/'
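
If you want a single value, for a monitoring script perhaps, the same idea stretches a little further. The sketch below is my own helper (inventory_stat is a hypothetical name) and assumes the "Name: value" layout shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper, not part of MCollective.

# inventory_stat NAME
# Read inventory output on stdin and print the value of one
# "Name: value" line from the Server Statistics block.
inventory_stat() {
    awk -v name="$1" '
        /Server Statistics:/ { in_stats = 1; next }
        # The block ends at the first blank line.
        in_stats && /^[ \t]*$/ { exit }
        in_stats {
            split_at = index($0, ": ")
            if (split_at == 0) next
            key = substr($0, 1, split_at - 1)
            sub(/^[ \t]+/, "", key)          # trim the leading padding
            if (key == name) { print substr($0, split_at + 2); exit }
        }
    '
}
```

$ mco inventory heliotrope | inventory_stat 'Replies Sent'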

Logging

The following logging defaults are used unless overridden in the server.cfg file.

logger_type = file
loglevel = info
logfile = /var/log/mcollective.log
keeplogs = 5
max_log_size = 2097152
logfacility = user

In this configuration mcollectived writes its own logs to disk, and does its own log rotation. It keeps five logs on disk, and rotates when each log reaches 2 MB.
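
The disk space this can consume is simply keeplogs multiplied by max_log_size, about 10 MB with the defaults. As a rough sketch, the hypothetical helpers below read both values from a config file, fall back to the defaults quoted above, and print the product in bytes. The parsing is deliberately simple and only understands plain "key = number" lines.

```shell
#!/bin/sh
# Hypothetical helpers, not part of MCollective.

# cfg_int KEY FILE DEFAULT
# Print the last integer value assigned to KEY in FILE, else DEFAULT.
cfg_int() {
    value=
    if [ -r "$2" ]; then
        value=$(sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p" "$2" | tail -n 1)
    fi
    echo "${value:-$3}"
}

# log_disk_budget [CONFIG]
# Approximate upper bound, in bytes, on the space used by the logs.
log_disk_budget() {
    cfg=${1:-/etc/mcollective/server.cfg}
    echo $(( $(cfg_int keeplogs "$cfg" 5) * $(cfg_int max_log_size "$cfg" 2097152) ))
}
```

With the defaults this prints 10485760.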

This may work for many underutilized hardware systems, but may be non-optimal in many situations where storage is expensive or the systems are virtualized. Personally I prefer to utilize the existing logging and analysis infrastructure, and recommend the following settings:

logger_type = syslog
loglevel = debug
logfacility = daemon

These settings are documented in detail at http://docs.puppetlabs.com/mcollective/configure/server.html#logging and http://docs.puppetlabs.com/mcollective/configure/client.html#logging.

Monitoring Servers

There are two ways to monitor that MCollective servers are alive: actively and passively.

An active check would be to issue a call to an agent available on every node, and validate the results. This could be something as simple as mco ping, a low-level connectivity test that requires no authentication or authorization. Or you could test a specific plugin, e.g. an NRPE check. We provide a script to do this in Creating a Standalone Client.
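
As a minimal sketch of the mco ping variant, the hypothetical helper below compares the replies against a list of servers you expect to hear from. It assumes that each reply line starts with the server’s identity followed by a time=... field, and that you maintain the expected list yourself, one name per line.

```shell
#!/bin/sh
# Hypothetical helper, not part of MCollective.

# missing_from_ping EXPECTED_FILE
# Read ping output on stdin and print every name listed in
# EXPECTED_FILE that has no reply line.
missing_from_ping() {
    awk '
        # First input: the expected names, one per line.
        FILENAME == ARGV[1] { if ($1 != "") expected[$1] = 1; next }
        # Second input (stdin): reply lines look like "name time=12.34 ms".
        $2 ~ /^time=/ { delete expected[$1] }
        END { for (host in expected) print host }
    ' "$1" - | sort
}
```

$ mco ping | missing_from_ping expected-servers.txt

Any name printed is a server that did not answer, which is what your monitoring system should alert on.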

A passive check would be to listen to the registration agent topic and look for servers that haven’t checked in recently. We discuss how to build a registration agent in Registration Collector. An example of how to check this with Nagios can be found on the Puppet Labs wiki page AgentRegistrationMonitor.
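
To illustrate the passive side, suppose a collector touches one file per server, named after the server, each time a registration message arrives. That layout is an assumption for this sketch and not something MCollective provides out of the box. Finding silent servers is then a matter of looking for old files:

```shell
#!/bin/sh
# Hypothetical helper; assumes a collector that updates one file
# per server in DIR whenever that server registers.

# stale_servers DIR MINUTES
# Print the servers whose file was last modified more than
# MINUTES minutes ago.
stale_servers() {
    find "$1" -type f -mmin "+$2" -exec basename {} \; | sort
}
```

With a registerinterval of 600 seconds, a threshold of 30 minutes allows a server to miss two registrations before it is reported.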
