Nagios – monitoring and notification

Nagios (http://www.nagios.org) is an open source monitoring and notification utility. It lets users monitor resources such as CPU, memory, disk usage, network status, reachability, HTTP status, and web page rendering, along with anything else covered by Nagios-compatible sensors. A large catalog of Nagios plugins covers the monitoring of almost all popular services and software. The best thing about Nagios is its plugin architecture: you can write a simple plugin for custom resource monitoring, so effectively anything whose state can be measured can be monitored via Nagios. This section discusses, very briefly, Nagios setup and how it can be used to monitor system resources and Cassandra.

Installing Nagios

Nagios ships in different packages, such as DIY, student, professional, and business, which differ in features and support; you may visit the Nagios website and choose one based on your needs. Given the number of free plugins, the Nagios free version is generally a good option. In this section, we will see how to install and configure the Nagios free version (from the source) on a CentOS machine. These instructions should work on any RHEL variant. For Ubuntu- or Debian-like environments, you may need to look for an apt-get equivalent of the yum commands in the script. Depending on your Linux distribution, Nagios may also be installable from additional package repositories. The packaged version may or may not be the latest release, but it removes a lot of installation hassle. We use tarball installation for this book to keep things generic.

Prerequisites

The Nagios server and its PHP-based web console have some dependencies that must be in place before you can start installing it.

PHP: You will need to have a PHP processor to run Nagios. Check its availability using the following command:

$ php -v 
PHP 5.3.26 (cli) (built: Jun 24 2013 18:08:10) 
Copyright (c) 1997-2013 The PHP Group 
Zend Engine v2.3.0, Copyright (c) 1998-2013 Zend Technologies

If PHP does not exist, install it.

$ sudo yum install php

httpd: The Apache httpd web server serves as the frontend to a PHP-based Nagios web application. To check whether you have httpd or not, execute the following command:
```
$ httpd -v 
Server version: Apache/2.2.24 (Unix) 
Server built:   May 20 2013 21:12:45
```
If httpd does not exist, install it.
```
$ sudo yum install httpd
```

GCC compiler: Check for the installed version of GCC compiler using the following command:

$ gcc -v 
Using built-in specs. 
COLLECT_GCC=gcc 
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-amazon-linux/4.6.3/lto-wrapper 
Target: x86_64-amazon-linux 
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --enable-languages=c,c++,objc,obj-c++,,fortran,ada,go,lto --enable-plugin --disable-libgcj --with-tune=generic --with-arch_32=i686 --build=x86_64-amazon-linux 
Thread model: posix 
gcc version 4.6.3 20120306 (Red Hat 4.6.3-2) (GCC)

Install it, if it does not exist:

$ sudo yum install gcc glibc glibc-common

GD graphics library: GD is a graphics library for dynamically generating images in various formats. Unfortunately, there is no quick way to check whether GD is installed. To install the GD library, execute the following command:
```
$ sudo yum install gd gd-devel
```

Preparation

Before we jump into installing Nagios, we need to set up a user account and a group for Nagios.

$ sudo -i 
$ useradd -m nagios 
$ passwd nagios 
$ groupadd nagcmd 
$ usermod -a -G nagcmd nagios 
$ usermod -a -G nagcmd apache

Installation

Nagios installation can be divided into four parts: installing Nagios, configuring Apache httpd, installing plugins, and setting up Nagios as a service.

Installing Nagios

The following are the steps to install Nagios from tarball:

Download tarball from the Nagios download page and untar it:

$ wget http://prdownloads.sourceforge.net/sourceforge/nagios/nagios-3.5.0.tar.gz 
$ tar xzf nagios-3.5.0.tar.gz

Install Nagios from the source:

$ cd nagios
$ ./configure --with-command-group=nagcmd
$ make all 
$ sudo make install 
   install-base 
   install-cgis 
   install-html 
   install-exfoliation 
   install-config 
   install-init 
   install-commandmode 
   fullinstall

Nagios is installed now. Update the contact details before you move to the next step:

$ sudo vi /usr/local/nagios/etc/objects/contacts.cfg

define contact{ 
 contact_name nagiosadmin     ; Short name of user 
 use          generic-contact ; Inherit default values
 alias        Nagios Admin    ; Full name of user 
 email        YOUR_EMAIL_ID   ; *SET EMAIL ADDRESS* 
}

Configuring Apache httpd

Set Apache httpd with the appropriate Nagios configuration:

$ sudo make install-webconf 
/usr/bin/install -c -m 644 sample-config/httpd.conf /etc/httpd/conf.d/nagios.conf 
*** Nagios/Apache conf file installed ***

Set the password for the Nagios web console for the user nagiosadmin:

$ sudo htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin

Restart Apache httpd:
```
$ sudo service httpd restart
```

Installing Nagios plugins

Download and untar Nagios plugins from the Nagios website's plugins page, http://www.nagios.org/download/plugins/ using the following commands:
```
$ wget http://prdownloads.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.16.tar.gz 
$ tar xzf nagios-plugins-1.4.16.tar.gz 
```
Install the plugin:
```
$ cd nagios-plugins-1.4.16 
$ ./configure --with-nagios-user=nagios --with-nagios-group=nagios
$ make 
$ make install
```
Note
Warning! If you get an error such as check_http.c:312:9: error: 'ssl_version' undeclared (first use in this function) while running ./configure or make, your system probably lacks the OpenSSL (libssl) development files. To resolve this issue, execute the following commands:
On RHEL- or CentOS-like systems:
yum install openssl-devel -y
On Debian- or Ubuntu-like systems:
sudo apt-get install libssl-dev
Re-run ./configure, then make clean, then make.

Setting up Nagios as a service

Everything is set; let's register Nagios as a service:

$ sudo chkconfig --add nagios 
$ sudo chkconfig nagios on

Check if the default configuration is good to go and start the Nagios service:

# Check configuration file
$ sudo /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg 
[-- snip --] 
Website: http://www.nagios.org 
Reading configuration data... 
   Read main config file okay... 
Processing object config file '/usr/local/nagios/etc/objects/commands.cfg'... 
[-- snip --] 
Processing object config file '/usr/local/nagios/etc/objects/localhost.cfg'... 
   Read object config files okay... 
Running pre-flight check on configuration data... 
[-- snip --] 
Total Warnings: 0 
Total Errors:   0 
Things look okay - No serious problems were detected during the pre-flight check

# Start Nagios as a service
$ sudo service nagios start

Now you are ready to see the Nagios web console. Open the http://NAGIOS_HOST_ADDRESS/nagios URL in your browser. You should be able to see the Nagios home page with a couple of default checks on the Nagios host.

Nagios plugins

Nagios' power comes from the large number of plugins available for it. The base package provides enough default plugins to perform decent resource monitoring. For advanced or non-standard monitoring, you will have to either download a plugin from somewhere, such as the Nagios plugins directory or GitHub, or write one of your own. Writing a custom plugin is very simple. There are only two requirements: the plugin should be executable from the command line, and it should exit with one of the following values:

0 implying OK state
1 implying warning state
2 implying critical state
3 implying unknown state

This means you are free to choose your programming language and tooling. As long as you follow these two specifications, your plugin can be used in Nagios.
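
As a rough illustration of those two requirements, the following shell sketch compares a measured number against warning and critical thresholds and reports the result through its exit status. The function name check_threshold and the wording of the status line are assumptions made for this example; they are not part of Nagios.

```shell
#!/bin/sh
# Hypothetical plugin sketch: compare a number against warning and critical
# thresholds and report the state using the exit values listed above.
check_threshold() { # check_threshold VALUE WARN CRIT
    value=$1; warn=$2; crit=$3
    # Anything that is not a non-negative integer is reported as unknown.
    case "$value" in
        ''|*[!0-9]*) echo "UNKNOWN - invalid value '$value'"; return 3 ;;
    esac
    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL - value=$value"; return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING - value=$value"; return 1
    fi
    echo "OK - value=$value"; return 0
}
```

A complete plugin script would take the measurement itself and finish with something like check_threshold "$measured" 80 90; exit $? so that the state becomes the exit value of the process.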

Note

Nagios plugins directory:

http://exchange.nagios.org/directory/Plugins

Nagios plugins projects on GitHub:

https://github.com/search?q=nagios+plugin&type=Repositories&ref=searchresults

Nagios plugins for Cassandra

There are a few Cassandra-specific plugins in the Nagios plugins directory. There is a promising project on GitHub, namely, Nagios Cassandra Monitor (https://github.com/dmcnelis/NagiosCassandraMonitor); it seems a little immature, but worth evaluating. In this subsection, we will use a JMX-based plugin that is not Cassandra-specific. We will use this plugin to connect to Cassandra nodes and query heap usage. This will tell us about two things: whether or not it can connect to Cassandra (which can be treated as an indication of whether or not the Cassandra process is up) and what the heap usage is.

The following are the steps to get the JMX plugin installed. All these operations take place on the Nagios machine and not on Cassandra nodes.

Download the plugins from http://exchange.nagios.org/directory/Plugins/Java-Applications-and-Servers/check_jmx/details.

Untar and navigate to the libexec directory:

$ tar xvzf check_jmx.tgz 
$ cd check_jmx/nagios/plugin/
$ sudo cp check_jmx jmxquery.jar /usr/local/nagios/libexec/

Assign them proper ownership and run a test:

$ cd /usr/local/nagios/libexec/
$ sudo chown nagios:nagios check_jmx jmxquery.jar

Replace 10.99.9.67 with your Cassandra node:

$ ./check_jmx -U service:jmx:rmi:///jndi/rmi://10.99.9.67:7199/jmxrmi -O java.lang:type=Memory -A HeapMemoryUsage -K used -I HeapMemoryUsage -J used -vvvv -w 4248302272 -c 5498760192

JMX OK HeapMemoryUsage.used=1217368912{committed=1932525568;init=1953497088;max=1933574144;used=1217368912}
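
The braces in that output list the individual HeapMemoryUsage attributes, which makes it easy to relate the used heap to the maximum when choosing the -w and -c values. The following sketch does this with the sample line above; heap_field is a hypothetical helper, and the parsing assumes the output keeps the key=value layout shown here.

```shell
#!/bin/sh
# Sample line copied from the check_jmx run above.
line='JMX OK HeapMemoryUsage.used=1217368912{committed=1932525568;init=1953497088;max=1933574144;used=1217368912}'

# Hypothetical helper: print the number stored under NAME inside the braces.
heap_field() { # heap_field NAME LINE
    printf '%s\n' "$2" | sed -n "s/.*[{;]$1=\([0-9][0-9]*\).*/\1/p"
}

used=$(heap_field used "$line")
max=$(heap_field max "$line")

# Integer percentage of the maximum heap that is currently in use.
echo "heap used: $((used * 100 / max))% of max"
```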

Executing remote plugins via an NRPE plugin

NRPE is a plugin to execute plugins on remote hosts. One may think of it as OpsCenter and its agents (see the following figure). With NRPE, Nagios can monitor remote host resources, such as memory, CPU, disk, network, and can execute any plugin on a remote machine.

Figure 7.6: Nagios with NRPE plugin in action

NRPE installation has to be done on the Nagios machine as well as all the other machines where we want to execute a Nagios plugin locally, for example, to monitor the CPU usage.

Installing NRPE on host machines

First, you need to create a nagios user and a nagios group and set a password for the user, as discussed in the Preparation subsection in this chapter. After that, install the Nagios plugins as described in the Installing Nagios plugins section in this chapter. Now you may proceed to the NRPE installation.

Install xinetd if it does not already exist:
```
$ sudo yum install xinetd
```

Download the NRPE daemon and plugin from the NRPE Nagios page at http://exchange.nagios.org/directory/Addons/Monitoring-Agents/NRPE--2D-Nagios-Remote-Plugin-Executor/details and install them:

# Download and untar NRPE
$ wget http://downloads.sourceforge.net/project/nagios/nrpe-2.x/nrpe-2.14/nrpe-2.14.tar.gz 
$ tar xvzf nrpe-2.14.tar.gz 

# make and install daemon and plugin, configure xinetd
$ cd nrpe-2.14 
$ ./configure 
$ make all 
$ sudo make install-plugin 
$ sudo make install-daemon 
$ sudo make install-daemon-config 
$ sudo make install-xinetd

After this, you need to make sure each host machine accepts requests coming from Nagios. To do so, edit /etc/xinetd.d/nrpe and add the Nagios host address to it. In the following snippet, replace NAGIOS_HOST_ADDRESS with the actual Nagios host address:
```
# edit /etc/xinetd.d/nrpe 
only_from = 127.0.0.1 NAGIOS_HOST_ADDRESS

# edit /etc/services append this
nrpe    5666/tcp             # NRPE
```
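
Since these edits are repeated on every monitored host, it can help to make the /etc/services change safe to rerun. A minimal sketch, assuming a hypothetical helper named ensure_nrpe_service that appends the entry only when no nrpe line for port 5666 is present:

```shell
#!/bin/sh
# Hypothetical helper: add the NRPE entry to a services file only if it is
# missing, so running the setup twice does not duplicate the line.
ensure_nrpe_service() { # ensure_nrpe_service FILE
    grep -Eq '^nrpe[[:space:]]+5666/tcp' "$1" ||
        printf 'nrpe    5666/tcp             # NRPE\n' >> "$1"
}
```

On a real host this would be run as root, for example ensure_nrpe_service /etc/services.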

Restart xinetd and test whether it is functional:

# Restart xinetd
$ sudo service xinetd restart 
Stopping xinetd:    [FAILED]
Starting xinetd:    [  OK  ] 

# Check if it's listening
$ netstat -at | grep nrpe
tcp    0    0 *:nrpe    *:*    LISTEN      
# Check NRPE plugin
$ /usr/local/nagios/libexec/check_nrpe -H localhost 
NRPE v2.14 

# Try to invoke a plugin via NRPE
$ /usr/local/nagios/libexec/check_nrpe -H localhost -c check_load 

OK - load average: 0.01, 0.04, 0.06|load1=0.010;15.000;30.000;0; load5=0.040;10.000;25.000;0; load15=0.060;5.000;20.000;0;

Now we have the machine ready to be monitored via NRPE.

Installing NRPE plugin on a Nagios machine

Installing NRPE plugin on a Nagios machine is a subset of the task that we did for the remote host machine. All you need to do is install the NRPE plugin and nothing else. The following are the steps:

$ wget http://downloads.sourceforge.net/project/nagios/nrpe-2.x/nrpe-2.14/nrpe-2.14.tar.gz 
$ tar xvzf nrpe-2.14.tar.gz 
$ cd nrpe-2.14 
$ ./configure 
$ make all 
$ sudo make install-plugin 

# Test if the plugin is working; replace 10.99.9.67 with the
# address of one of the machines running NRPE + xinetd
$ /usr/local/nagios/libexec/check_nrpe -H 10.99.9.67 
NRPE v2.14

Setting things up to monitor

In this section, we will talk about how to set up CPU, disk, and Cassandra monitoring. However, the detail is enough to enable you to set up any Nagios plugin and configure monitoring.

Monitoring CPU and disk space: These checks need to be executed on the remote machines, so we may need to adjust the NRPE configuration to allow those plugins to be executed remotely. This configuration is stored in /usr/local/nagios/etc/nrpe.cfg. If the plugin you want to execute is not listed there, or you want to change the parameters passed to a plugin, this is the place to do it:

# edit /usr/local/nagios/etc/nrpe.cfg
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10 

command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20 

command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
[-- snip --] 

#custom commands *add your commands here*

# EC2 ephemeral storage root disk 
command[check_sda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda1

Have a look at the following screenshot:

Figure 7.7: Nagios interface monitoring local and remote resources

As you can see, we have a CPU check (check_load) and a disk check already provided by the default configuration. However, if I wanted to monitor the /dev/sda1 device for space availability, I would add a new check, check_sda1, for this.
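
To see which command names a host currently exposes through NRPE, the command[...] lines can be listed directly from the configuration file. A small sketch, with list_nrpe_commands as a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: print the name inside command[...] for every
# uncommented command definition in an NRPE configuration file.
list_nrpe_commands() { # list_nrpe_commands FILE
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}
```

Calling list_nrpe_commands /usr/local/nagios/etc/nrpe.cfg prints one name per line; each name is a value that can be passed to check_nrpe with -c.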

Setting up a JMX monitor: For Cassandra, we want to check the JVM heap usage via JMX. Since this executes on the local machine (Nagios) to connect to the JMX service on the remote machine, we do not need to use NRPE for this. So, we have nothing to do here.

Updating configuration: The best part of Nagios is its configuration. With a little grouping, you can build a configuration that scales to hundreds of machines. All configuration in Nagios is text-based, with a JSON-ish block syntax. You can organize the files in whichever way you want and let Nagios know where they are. For this particular case, the /usr/local/nagios/etc/objects/cassandrahosts.cfg file is created. This file houses all the information related to monitoring. The following is what it looks like (see the comments):

# A machine to be monitored
# DEFINE ALL CASSANDRA HOSTS HERE 

define host{ 
        use                     linux-server 
        host_name               cassandra1 
        alias                   Cassandra Machine 
        address                 10.99.9.67 
        } 

# create logical groupings, manageable, saves typing
# HOST GROUP TO COLLECTIVELY CALL ALL CASSANDRA HOSTS 

define hostgroup{ 
        hostgroup_name  cassandra_grp 
        alias           Cassandra Group 
        members         cassandra1  ;this is CSV of 
                                    ;hosts defined above 
        } 

# A service defines what command to execute on what hosts
# MONITORING SERVICES 

# A service that executes locally
#Check Cassandra on remote machines 

define service{ 
        use                     generic-service 
        hostgroup_name          cassandra_grp 
        service_description     Cassandra 
        check_command           check_cas ;defined below 
        } 

# A service that gets executed remotely via NRPE
# check disk space status 
define service{ 
  use                 generic-service 
  hostgroup_name      cassandra_grp 
  service_description check disk 
  check_command       check_nrpe!check_sda1 
  } 

# check CPU status 
define service{ 
  use                 generic-service 
  hostgroup_name      cassandra_grp 
  service_description check CPU 
  check_command       check_nrpe!check_load 
  } 

# A command is a template of a command line call, here:
#   $USER1$ is plugin directory, nagios/libexec
#   $HOSTADDRESS$ resolves to the address defined in 
#   host block above, hosts are chosen from the service that
#   calls this command

# define custom commands 
# check JVM heap usage using JMX, 
# warn if > 3.7G, mark critical if > 3.85G 

define command { 
        command_name check_cas 
        command_line $USER1$/check_jmx -U service:jmx:rmi:///jndi/rmi://$HOSTADDRESS$:7199/jmxrmi -O java.lang:type=Memory -A HeapMemoryUsage -K used -I HeapMemoryUsage -J used -vvvv -w 3700000000 -c 3850000000 
        }
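
When the cluster has more than a handful of nodes, the host blocks can be generated instead of typed. The following is a minimal sketch, not part of Nagios: gen_hosts is a hypothetical helper that reads "name address" pairs and prints blocks in the same shape as the cassandra1 block above, and the second node (cassandra2, 10.99.9.68) is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print one host block per "name address" pair read
# from standard input; blank input lines are skipped.
gen_hosts() {
    while read -r name address; do
        [ -n "$name" ] || continue
        printf 'define host{\n'
        printf '        use                     linux-server\n'
        printf '        host_name               %s\n' "$name"
        printf '        alias                   Cassandra Machine\n'
        printf '        address                 %s\n' "$address"
        printf '        }\n\n'
    done
}

# Example input: cassandra1 is the host used in this section,
# cassandra2 is an assumed second node.
printf 'cassandra1 10.99.9.67\ncassandra2 10.99.9.68\n' | gen_hosts
```

The generated host names still have to be added to the members list of the cassandra_grp host group so that the services apply to them.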

Letting Nagios know about the new configuration: We have created a new configuration file that Nagios does not know about. We need to register it in /usr/local/nagios/etc/nagios.cfg; append the following line to this file:

#custom file *ADD YOUR FILES HERE* 
cfg_file=/usr/local/nagios/etc/objects/cassandrahosts.cfg

Test the configuration and you are done.

$ sudo /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Nagios Core 3.5.0 
[-- snip --] 
Reading configuration data... 
   Read main config file okay... 
Processing object config file '/usr/local/nagios/etc/objects/commands.cfg'... 
Processing object config file '/usr/local/nagios/etc/objects/contacts.cfg'... 
Processing object config file '/usr/local/nagios/etc/objects/timeperiods.cfg'... 
Processing object config file '/usr/local/nagios/etc/objects/templates.cfg'... 
Processing object config file '/usr/local/nagios/etc/objects/cassandrahosts.cfg'... 
Processing object config file '/usr/local/nagios/etc/objects/localhost.cfg'... 
   Read object config files okay... 
Running pre-flight check on configuration data... 
[-- snip --] 
Total Warnings: 0 
Total Errors:   0 
Things look okay - No serious problems were detected during the pre-flight check

Restart Nagios by executing sudo service nagios restart.
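
Because a broken configuration can keep Nagios from coming back up, the check and the restart can be tied together. This is a sketch under stated assumptions: safe_restart is a hypothetical wrapper, the three variables are illustrative defaults taken from the paths and commands used in this section, and it assumes that nagios -v exits with a non-zero status when the pre-flight check fails.

```shell
#!/bin/sh
# Illustrative defaults; override them in the environment if your paths differ.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}
RESTART_CMD=${RESTART_CMD:-sudo service nagios restart}

# Hypothetical wrapper: restart only when the configuration check succeeds.
safe_restart() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null 2>&1; then
        $RESTART_CMD
    else
        echo "configuration check failed; Nagios not restarted" >&2
        return 1
    fi
}
```

If the check fails, rerun the nagios -v command by hand to see the full error output.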

Monitoring and notification using Nagios

Nagios has built-in support to send mails whenever an interesting event, such as a warning, an error, or a service coming back to the OK state, occurs. By default, it uses the mail command, so if your mail is configured correctly, you should see mails when you execute the following command:

# substitute YOUR_EMAIL_ADDRESS with your email id.
/usr/bin/printf "%b" "Hi Nishant, \nthis is Nagios." | /bin/mail -s "Nagios test mail" YOUR_EMAIL_ADDRESS

If this does not reach your mailbox or the spam folder, you should check your mail configuration. If you do not have the mail utility installed already, execute the following command:

# mail utility on RHEL like OS
$ sudo yum install mailx

# On Ubuntu or Debian derivatives
$ sudo apt-get install mailutils

If you are not happy with the mailing option or want to change the mailer to send mail via a specific mail provider like Gmail, you should dig into the plugins directory or GitHub to find appropriate alternatives.

Nagios provides a pretty intuitive GUI, a web-based console that immediately highlights anything that is wrong with any service or host. Apart from displaying the immediate state, Nagios also stores the history of monitored events. There are many reporting capabilities that provide a complete overview of infrastructure status. One can easily generate a histogram that shows the performance of a service. Refer to the following diagram:

Figure 7.8: An auto generated histogram report from Nagios

There are many reporting options, as well as options to disable alerts during scheduled infrastructure downtime. It is worth playing around with the Nagios GUI to learn about the various options.
