Writing a new plugin from scratch

Even with the very useful standard plugins in the Nagios Plugins set and the large number of custom plugins available on Nagios Exchange, as our monitoring setup grows more refined we may find that there is a service or property of a host that we would like to check, but for which no suitable plugin seems to be available. Every network is different, and sometimes the plugins that others have generously donated their time to write for the community don't quite cover all your bases. Generally, the more specific your monitoring requirements get, the less likely it is that there's a plugin available that does exactly what you need.

In this example, we'll deal with a very particular problem that we'll assume can't be dealt with effectively by any known Nagios Core plugins, and we'll write one ourselves using Perl. Here's the example problem.

Our Linux security team wants to be able to automatically check whether any of our servers are running kernels that have known exploits. However, they're not worried about every vulnerable kernel, only certain ones. They have provided us with the version numbers of three kernels, which have small vulnerabilities that they're not particularly worried about but that do need patching, and one they're extremely worried about.

Let's say the minor vulnerabilities are in the kernels with version numbers 2.6.19, 2.6.24, and 3.0.1, and the serious vulnerability is in the kernel with version number 2.6.39. Note that the version numbers in this case are arbitrary and don't necessarily reflect any real kernel vulnerabilities!

The team could log in to all the servers individually to check them, but the servers are of varying ages and access methods and are managed by different people. They would also have to manually check the kernel version more than once because it's possible that a naive administrator could upgrade the system to a kernel that's known to be vulnerable in an older release, and they also might want to add other vulnerable kernel numbers to be checked later on.

So, the team has asked us to solve the problem with Nagios Core monitoring and we've decided that the best way to do this is to write our own plugin, check_vuln_kernel, which checks the output of uname(1) for a kernel version string and then does the following:

  • If the system is using one of the slightly vulnerable kernels, the plugin will return a WARNING state so that we can let the security team know that they should address it when they're next able to.
  • If it's the highly vulnerable kernel version, it will return a CRITICAL state so that the security team knows that a patched kernel needs to be installed immediately.
  • If uname(1) gives an error or output that we don't understand, it will return an UNKNOWN state, alerting the team to a bug in the plugin or possibly more serious problems with the server.
  • Otherwise, it returns an OK state, confirming that the kernel is not known to be a vulnerable one.
  • Finally, they want to be able to see at a glance in the Nagios Core monitoring what the kernel version is and whether it's vulnerable or not.

For the purposes of this example, we'll only monitor the Nagios Core server itself, but via NRPE we'd be able to install this plugin on the other servers that require this monitoring, where it will work just as well. You should refer to the Monitoring local services on a remote machine with NRPE recipe in Chapter 6, Enabling Remote Execution to learn how to do this.

While this problem is very specific, we'll approach it in a very general way, which you'll be able to adapt to any situation where a Nagios plugin needs to do the following tasks:

  • Run a command and pull its output into a variable
  • Check the output for the presence or absence of certain patterns
  • Return an appropriate status based on those tests

In other words, if you're able to do these three things, you'll be able to monitor almost anything from Nagios Core effectively!
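As a rough illustration of those three tasks, here is a minimal sketch in core Perl, without Nagios::Plugin. The helper names (capture, status_for_output, check), the EXAMPLE label, and the placeholder patterns are assumptions made for this sketch and are not part of the recipe; only the exit codes 0, 1, 2, and 3 for OK, WARNING, CRITICAL, and UNKNOWN are standard plugin behavior.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Exit codes that Nagios Core expects from any plugin
my %EXIT_CODE = ( OK => 0, WARNING => 1, CRITICAL => 2, UNKNOWN => 3 );

# Task 1: run a command and pull its output into a variable.
# Returns undef if the command could not be run or exited non-zero.
sub capture {
    my ($command) = @_;
    my $output = qx{$command};
    return if $? != 0;
    chomp $output;
    return $output;
}

# Task 2: check the output for the presence or absence of patterns.
# These patterns are placeholders for the sketch, not real checks.
sub status_for_output {
    my ($output) = @_;
    return 'UNKNOWN'  if !defined $output || $output !~ m/\S/;
    return 'CRITICAL' if $output =~ m/\b(?:fatal|panic)\b/i;
    return 'WARNING'  if $output =~ m/\bdeprecated\b/i;
    return 'OK';
}

# Task 3: turn the result into an exit code and a one-line report
sub check {
    my ( $label, $command ) = @_;
    my $output = capture($command);
    my $status = status_for_output($output);
    my $detail = ( defined $output && length $output ) ? $output : 'no output';
    return ( $EXIT_CODE{$status}, "$label $status: $detail" );
}

my ( $code, $line ) = check( 'EXAMPLE', 'uname -r' );
print "$line\n";

# A real plugin would finish by handing the code back to Nagios Core:
# exit $code;
```

The final exit is left commented out so that the sketch can be loaded and experimented with; in a real plugin, the exit code is what Nagios Core actually acts on, and the printed line is what appears in the web interface.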

Getting ready

You should have a Nagios Core 4.0 or newer server running with a few hosts and services configured already. You should also already be familiar with the relationship between services, commands, and plugins.

You should have Perl installed, at least version 5.10, which includes the required POSIX module. You should also have the Perl modules Nagios::Plugin (or Monitoring::Plugin) and Readonly installed. On Debian-like systems, you can install these using the following command:

# apt-get install libnagios-plugin-perl libreadonly-perl

On RPM-based systems, such as CentOS or Fedora Core, this should work:

# yum install perl-Nagios-Plugin perl-Readonly

This will be a rather long recipe that ties in a lot of Nagios Core concepts. For this, you should be familiar with all the following concepts:

  • Defining new hosts and services and how they relate to one another
  • Defining new commands and how they relate to the plugins they call
  • Installing, testing, and using Nagios Core plugins

Some familiarity with Perl would also be helpful, but not required. We'll include comments to explain what each block of code is doing in the plugin.

How to do it...

We can write, test, and implement our example plugin as follows:

  1. Change to the directory containing the executable plugin programs for Nagios Core. The default location is /usr/local/nagios/libexec:
    # cd /usr/local/nagios/libexec
    
  2. Start editing a new file called check_vuln_kernel:
    # vi check_vuln_kernel
    
  3. Include the following code in it. Take note of the comments, which explain what each block of code is doing:
    #!/usr/bin/env perl
    
    # Use strict Perl style
    use strict;
    use warnings;
    use utf8;
    
    # Require at least Perl v5.10
    use 5.010;
    
    # Require a few modules, including Nagios::Plugin
    use Nagios::Plugin;
    use POSIX;
    use Readonly;
    
    # Declare some constants with patterns that match bad kernels; the
    # (?!\d) lookahead stops a longer number such as 2.6.390 from matching
    Readonly::Scalar my $CRITICAL_REGEX => qr/^2[.]6[.]39(?!\d)/msx;
    Readonly::Scalar my $WARNING_REGEX =>
      qr/^(?:2[.]6[.](?:19|24)|3[.]0[.]1)(?!\d)/msx;
    
    # Run POSIX::uname() to get the kernel version string
    my @uname   = uname();
    my $version = $uname[2];
    
    # Create a new Nagios::Plugin object
    my $np = Nagios::Plugin->new();
    
    # If we couldn't get the version, bail out with UNKNOWN
    if ( !$version ) {
        $np->nagios_die('Could not read kernel version string');
    }
    
    # Exit with CRITICAL if the version string matches the critical pattern
    if ( $version =~ $CRITICAL_REGEX ) {
        $np->nagios_exit( CRITICAL, $version );
    }
    
    # Exit with WARNING if the version string matches the warning pattern
    if ( $version =~ $WARNING_REGEX ) {
        $np->nagios_exit( WARNING, $version );
    }
    
    # Exit with OK if neither of the patterns matched
    $np->nagios_exit( OK, $version );
  4. Make the plugin owned by the nagios group with chown(1) and executable with chmod(1):
    # chown root:nagios check_vuln_kernel
    # chmod 0770 check_vuln_kernel
    
  5. Run the plugin directly to test it:
    # sudo -s -u nagios
    $ ./check_vuln_kernel
    VULN_KERNEL OK: 3.16.0-4-amd64
    

We should now be able to use the plugin in a command, and hence in a service check, just like any other plugin.

How it works...

The code we added in the new plugin file, check_vuln_kernel, can be understood in five general steps:

  1. It runs Perl's POSIX uname implementation to get the version number of the kernel.
  2. If that doesn't work, it exits with an UNKNOWN status.
  3. If the version number matches anything in a pattern containing critical version numbers, it exits with a CRITICAL status.
  4. If the version number matches anything in a pattern containing warning version numbers, it exits with a WARNING status.
  5. Otherwise, it exits with an OK status.

It also prints the status as a string, along with the kernel version number, if it was able to retrieve one.
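The decision logic in steps 2 to 5 can be tried out on its own, away from uname. The following is a sketch only: state_for_version is a hypothetical helper, not part of the plugin, and the patterns here use a negative lookahead, (?!\d), as one way of making sure that a longer version number such as 2.6.390 is not mistaken for 2.6.39.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Patterns for the example versions from this recipe. The (?!\d)
# lookahead (an assumption of this sketch) rejects longer numbers.
my $CRITICAL_REGEX = qr/^2[.]6[.]39(?!\d)/msx;
my $WARNING_REGEX  = qr/^(?:2[.]6[.](?:19|24)|3[.]0[.]1)(?!\d)/msx;

# Hypothetical helper: return the state name for a kernel version string
sub state_for_version {
    my ($version) = @_;
    return 'UNKNOWN'  if !defined $version || $version eq q{};
    return 'CRITICAL' if $version =~ $CRITICAL_REGEX;
    return 'WARNING'  if $version =~ $WARNING_REGEX;
    return 'OK';
}

# Try a few made-up version strings, including near misses
my @samples = (
    '2.6.39', '2.6.39-4-amd64', '2.6.390',
    '3.0.1-1', '3.0.10', '3.16.0-4-amd64',
);
for my $version (@samples) {
    printf "%-16s %s\n", $version, state_for_version($version);
}
```

Run against these sample strings, it should report CRITICAL for 2.6.39 and 2.6.39-4-amd64, WARNING for 3.0.1-1, and OK for the rest. Testing the patterns like this before deploying the plugin is cheaper than finding out from a missed alert.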

We might set up a command definition for this plugin as follows:

define command {
    command_name  check_vuln_kernel
    command_line  $USER1$/check_vuln_kernel
}

In turn, we might set up a service definition for this command like this:

define service {
    use                  local-service
    host_name            localhost
    service_description  VULN_KERNEL
    check_command        check_vuln_kernel
}

If the kernel was not vulnerable, the service's appearance in the web interface might be something like this:

[Screenshot: the VULN_KERNEL service in an OK state in the web interface]

However, if the monitoring server itself happened to be running a vulnerable kernel, it might look more like this (and send consequent notifications, if configured to do so):

[Screenshot: the VULN_KERNEL service in a non-OK state in the web interface]

There's more...

This may be a simple plugin, but its structure can be generalized to all sorts of monitoring tasks. If we can figure out the correct logic to return the status we want in an appropriate programming language, then we can write a plugin to do basically anything.
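Since the security team may want to add other vulnerable kernel versions later, one possible refinement is to keep the versions in plain lists and build the patterns from them, so that adding a kernel is a one-line change. This is a sketch under that assumption; pattern_for and the two list names are hypothetical and do not appear in the plugin above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Version lists from the example; adding a kernel means adding a string
my @CRITICAL_VERSIONS = ('2.6.39');
my @WARNING_VERSIONS  = ( '2.6.19', '2.6.24', '3.0.1' );

# Hypothetical helper: compile a list of literal versions into a pattern.
# quotemeta escapes the dots; (?!\d) rejects longer version numbers.
sub pattern_for {
    my (@versions) = @_;
    my $alternatives = join '|', map { quotemeta $_ } @versions;
    return qr/^(?:$alternatives)(?!\d)/;
}

my $critical = pattern_for(@CRITICAL_VERSIONS);
my $warning  = pattern_for(@WARNING_VERSIONS);

# Classify a few made-up version strings
for my $version ( '2.6.24-generic', '2.6.39', '4.9.0' ) {
    my $state =
        $version =~ $critical ? 'CRITICAL'
      : $version =~ $warning  ? 'WARNING'
      :                         'OK';
    print "$version: $state\n";
}
```

The lists could equally be read from a file or from plugin arguments; that choice depends on who maintains the list and how often it changes.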

A plugin like this could just as effectively be written in C for improved performance, but we'll assume, for simplicity's sake, that high performance is not required, so we can instead use a language that's better suited to quick ad hoc scripts like this one; in this case, Perl. The utils.sh file, also in /usr/local/nagios/libexec, allows us to write plugins as shell scripts if we'd prefer that.

If you prefer Python, the nagiosplugin library should meet your needs for both Python 2 and Python 3. Ruby users may like the nagiosplugin gem.

If you write a plugin that you think could be generally useful for the Nagios community at large, please consider putting it under a free software license and submitting it to Nagios Exchange so that others can benefit from your work. Community contribution and support are what have made Nagios Core such a great monitoring platform with such wide use.

Any plugin you publish in this way should conform to the Nagios Plugin Development Guidelines. At the time of writing this, these are available at https://nagios-plugins.org/doc/guidelines.html.

You may find older Nagios Core plugins written in Perl using the utils.pm file instead of Nagios::Plugin or Monitoring::Plugin. This will work fine, but Nagios::Plugin is recommended, as it includes more functionality out of the box and tends to be easier to use.

See also

  • Monitoring local services on a remote machine with NRPE, Chapter 6, Enabling Remote Execution
  • The Creating a new command section in this chapter
  • Creating a new service, Chapter 1, Understanding Hosts, Services, and Contacts
  • The Customizing an existing command section in this chapter
  • The Implementing threshold checks in a plugin section in this chapter
  • The Using macros as environment variables in a plugin section in this chapter