Chapter 8.  Notifications and Events

We already know how notifications work in Nagios. The previous chapters described how Nagios sends information to the users when a problem occurs and how it decides to notify people. Previously, our examples were limited to sending an e-mail 24 hours a day or only during working days.

In this chapter, we will cover the following items:

  • Creating effective notifications
  • Understanding escalations
  • Sending commands to Nagios
  • Creating event handlers
  • Using adaptive monitoring

Creating effective notifications

This section covers notifications in more depth and describes in detail how Nagios can tell other people what is happening. We will discuss both simple and more complex approaches to making notifications work for you.

A plain e-mail notification about a problem is not always the right thing to do. As people's inboxes get cluttered, the usual approach is to create rules that move certain messages to separate folders, where they are never looked at. There's a good chance that if people start getting a lot of notifications they don't need to react to, they'll simply ask their mail program to move these messages into a "do not look in here unless you have plenty of time" folder. When an issue that they should be handling does come up, they will most probably not even see the notification e-mail.

This section talks about the things that can be implemented in your company to make notifications more convenient for the IT staff. Limiting the amount of irrelevant information sent to various people tends to decrease their response time, as they will have much less information to filter out.

Using multiple notifications

The first issue that many Nagios administrators overlook is the ability to create more than one notification command. In this way, Nagios can try to notify you on both instant messaging (such as HipChat, Slack, Jabber, Twitter, or Telegram) and e-mail. It can also send you an SMS. A disadvantage is that at some point, you might end up receiving text messages at 2 AM about an outage of a machine that may well be down for the next three days and is not critical.

At this point, it's worth mentioning an easy solution: create multiple contacts for a single person. You can set up different contacts for when you're at work and when you're offline, and define a profile that does not disturb you too much during the night.

For example, you can set up the following contacts to handle various times of the day in a different fashion:

  • jdoe-workhours would be a contact that will only receive notifications during working hours; notifications will be carried out using both the corporate IM system and an e-mail
  • jdoe-daytime would be a contact that will only receive notifications between 7 AM and 10 PM, excluding working hours; notifications will be sent as a text or a pager message, and an e-mail
  • jdoe-night would be a contact that will only receive notifications between 10 PM and 7 AM; notifications will only be sent out as an e-mail

All entries would also contain contact groups pointing to the same groups that the single jdoe contact entry used to contain. This way, the other objects such as hosts, services, or contact groups related to this user would not be affected. All entries would also reside in the same file, for example, contacts/jdoe.cfg.
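As a rough sketch, the three entries in contacts/jdoe.cfg could look like the following. The time period names, the 09:00-17:00 working hours, the admins contact group, the addresses, and the _JABBERID and _SMSNUMBER custom variables are assumptions made for illustration. The Jabber and Twilio commands are the ones defined later in this chapter, and notify-host-by-email and notify-service-by-email are assumed to be defined already.

```
# Time periods (names and hours are illustrative assumptions)
define timeperiod{
        timeperiod_name jdoe-workhours
        alias           John Doe - working hours
        monday          09:00-17:00
        tuesday         09:00-17:00
        wednesday       09:00-17:00
        thursday        09:00-17:00
        friday          09:00-17:00
        }

define timeperiod{
        timeperiod_name jdoe-daytime
        alias           John Doe - 7 AM to 10 PM outside working hours
        monday          07:00-09:00,17:00-22:00
        tuesday         07:00-09:00,17:00-22:00
        wednesday       07:00-09:00,17:00-22:00
        thursday        07:00-09:00,17:00-22:00
        friday          07:00-09:00,17:00-22:00
        saturday        07:00-22:00
        sunday          07:00-22:00
        }

define timeperiod{
        timeperiod_name jdoe-night
        alias           John Doe - 10 PM to 7 AM
        monday          00:00-07:00,22:00-24:00
        tuesday         00:00-07:00,22:00-24:00
        wednesday       00:00-07:00,22:00-24:00
        thursday        00:00-07:00,22:00-24:00
        friday          00:00-07:00,22:00-24:00
        saturday        00:00-07:00,22:00-24:00
        sunday          00:00-07:00,22:00-24:00
        }

# Settings shared by all three contacts (a template; not registered as a contact)
define contact{
        name                            jdoe-base
        register                        0
        contactgroups                   admins
        host_notifications_enabled      1
        service_notifications_enabled   1
        host_notification_options       d,u,r
        service_notification_options    w,u,c,r
        email                           jdoe@example.com
        _JABBERID                       jdoe@jabber.example.com
        _SMSNUMBER                      +15555550100
        }

# Working hours: instant message and e-mail
define contact{
        use                             jdoe-base
        contact_name                    jdoe-workhours
        alias                           John Doe (working hours)
        host_notification_period        jdoe-workhours
        service_notification_period     jdoe-workhours
        host_notification_commands      notify-host-by-jabber,notify-host-by-email
        service_notification_commands   notify-service-by-jabber,notify-service-by-email
        }

# 7 AM to 10 PM outside working hours: text message and e-mail
define contact{
        use                             jdoe-base
        contact_name                    jdoe-daytime
        alias                           John Doe (daytime)
        host_notification_period        jdoe-daytime
        service_notification_period     jdoe-daytime
        host_notification_commands      notify-host-by-twilio,notify-host-by-email
        service_notification_commands   notify-service-by-twilio,notify-service-by-email
        }

# 10 PM to 7 AM: e-mail only
define contact{
        use                             jdoe-base
        contact_name                    jdoe-night
        alias                           John Doe (night)
        host_notification_period        jdoe-night
        service_notification_period     jdoe-night
        host_notification_commands      notify-host-by-email
        service_notification_commands   notify-service-by-email
        }
```

Keeping the shared settings in a template means the contact group membership and addresses are maintained in one place, while each contact only states when and how it is notified.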

The main drawback of this approach is that logging on to the web interface would require using one of the users above or keeping the jdoe contact without any notifications, just to be able to log on to the interface.

The example above combined both the creation of multiple contacts and use of multiple notification commands to achieve a convenient way of getting notified about a problem. Using only multiple contacts also works fine. Another approach to the problem is to define different contacts for different ways of being notified—for example, jdoe-e-mail, jdoe-sms, and jdoe-jabber. This way, you can define different contact methods for various time periods—instant messages during working hours, text messages while on duty, and an e-mail when not at work.

Another important issue is to make sure that as few people as possible are notified of the problem. Imagine there is a host without an explicit administrator assigned to it. A notification about a problem gets sent out to 20 different people. In such a case, either each of them will assume that someone else will resolve the problem, or they will get tangled up in discussing who will actually handle it.

Teams that cooperate tightly usually solve these issues naturally: knowledgeable people start discussing a solution, and the obvious person to solve the issue emerges from the discussion. However, teams that are distributed across various locations, or that have trouble communicating efficiently, will run into problems in such cases.

This is why it is a good idea to either nominate a coordinator who will assign tasks as they arise or try to maintain a short list of people responsible for each machine. If you need to make sure that other people will investigate the problem if the original owner of the machine cannot do it immediately, then it is a good idea to use escalations for this purpose. These are described later in this chapter.

Previously, we mentioned that notifications only via e-mail may not always be the best thing to do. For example, they don't work well for situations that require fast response times. There are various reasons behind this. Firstly, e-mails are slow—even though the e-mail lands on your mail server in a few seconds, people usually only poll their e-mails every few minutes. Secondly, people tend to filter e-mails and skip those that they are not interested in.

Another good reason why e-mails should not always be used is that they stay in your e-mail account until you actually fetch and read them. If a problem occurred while you were on a 2-week vacation, should you still be worried when you read the notification after you get back? Has the issue been resolved already?

If your team needs to react to problems promptly, using e-mail as the basic notification method is definitely not the best choice. Let's consider what other possibilities exist to notify users of a problem effectively.

As already mentioned, a very good choice is to use instant messaging or Short Message Service (SMS) messages as the basic means of notification, and only use e-mail as a last resort. Some companies might also use the client/server approach to notify the users of the problems, perhaps integrated with showing Nagios status only for particular hosts and services. Nagios Exchange has plenty of available solutions you can use for handling notifications effectively. Visit http://exchange.nagios.org/ for more details.

Sending instant messages via Jabber

The first and the most powerful option is to use Jabber (http://www.jabber.org/) for notifications. There is an existing script for this that is available in the contributions repository on the Nagios project website (directly available at http://nagios.sf.net/download/contrib/notifications/notify_via_jabber).

This is a small Perl script that sends messages over Jabber. You may need to install additional system packages to handle Jabber connectivity from Perl.

On Ubuntu, this requires running the following command:

root@ubuntu1:~# apt-get install libnet-jabber-perl 

When using the Comprehensive Perl Archive Network (CPAN, the source for Perl modules and documentation; visit http://www.cpan.org/) to install Perl packages, run the following command:

root@ubuntu1:~# cpan Net::Jabber

In order to use the notification plugin, you will need to customize the script—change the SERVER, PORT, USER, and PASSWORD parameters to an existing account. Our recommendation is to create a separate account to use only for Nagios notifications—you will need to set up authorization for each user that you want to send notifications to.

After modifying the script, it can be used for notifications as follows:

define command{ 
        command_name    notify-host-by-jabber 
        command_line    /path/to/notify_via_jabber $_CONTACTJABBERID$ "Nagios Host Notification Type: $NOTIFICATIONTYPE$ Host: $HOSTNAME$ State: $HOSTSTATE$ Address: $HOSTADDRESS$ Info: $HOSTOUTPUT$" 
        } 
 
define command{ 
        command_name    notify-service-by-jabber 
        command_line    /path/to/notify_via_jabber $_CONTACTJABBERID$ "Nagios Service Notification Type: $NOTIFICATIONTYPE$ Service: $SERVICEDESC$ Host: $HOSTALIAS$ Address: $HOSTADDRESS$ State: $SERVICESTATE$ Additional Info: $SERVICEOUTPUT$" 
        } 

The commands above can be used for host and service notifications and will send a descriptive message using Jabber to the specified user. The $_CONTACTJABBERID$ text will be replaced with the current contact's _JABBERID custom variable.

Please note that due to how Jabber works, the best approach is for the notify_via_jabber script to use the same Jabber server as the client for receiving notifications.

As you plan to monitor servers and potentially even outgoing Internet connectivity, it would not be wise to use public Jabber servers for reporting errors. Therefore, it would be a good idea to set up a private Jabber server, probably on the same host on which the Nagios monitoring system is running.

There are multiple desktop clients for the Jabber protocol that can be used to receive Nagios notifications in a convenient way. Pidgin, available from http://www.pidgin.im/, is a cross-platform instant messaging client that supports multiple protocols, including Jabber.

Notifying users with text messages

There are also very useful packages for sending SMS (the text messages in mobile phones). There are multiple interfaces for sending SMS information over the Internet—such as http://www.twilio.com/, which offers a service to send SMS to phones in a large number of countries.

Using Twilio to send notifications from Nagios is straightforward. Download the twilio-sms command-line tool from https://www.twilio.com/labs/bash/sms. It also requires creating a configuration file that specifies account information for Twilio. For an installation performed according to the steps given in Chapter 2, Installing Nagios 4, the location for the file is /opt/nagios/.twiliorc.

Next, create a Nagios command that uses the twilio-sms command directly—such as:

define command{ 
        command_name    notify-host-by-twilio 
        command_line    echo "Nagios $NOTIFICATIONTYPE$ Host: $HOSTNAME$ State: $HOSTSTATE$" | /path/to/twilio-sms $_CONTACTSMSNUMBER$ 
        } 
define command{ 
        command_name    notify-service-by-twilio 
        command_line    echo "Nagios $NOTIFICATIONTYPE$ Svc: $SERVICEDESC$ Host: $HOSTALIAS$ State: $SERVICESTATE$" | /path/to/twilio-sms $_CONTACTSMSNUMBER$ 
        } 

The downside of using Internet-based notification services is that if Internet connectivity is down, it is not possible for Nagios to send notifications. This may be a problem for Internet providers, which need to be sure their customers are online all the time.

Another possibility for sending notifications is to use GSM terminals or USB modems that offer a convenient way to send SMS notifications. Both GSM terminals and USB modems can be used to send text messages over regular SIM cards—which only require GSM coverage and do not require Internet access. These devices are usually connected via USB or serial port.

There are multiple tools that allow managing GSM terminals/modems—such as Gammu (http://wammu.eu/gammu/) and Gnokii (http://www.gnokii.org/).

Both are widely used applications. When setting up a GSM terminal, it is best to check how well each of them supports the specific hardware and choose the one that supports it better. Depending on the exact hardware, additional steps to set up drivers and/or configure Gammu or Gnokii may be needed; consult the documentation for the chosen tool as well as for the GSM terminal.

After setting up, both Gammu and Gnokii provide command line tools for sending SMS messages. The example below shows how to send messages using Gammu:

define command{ 
        command_name    notify-host-by-gammu 
        command_line    echo "Nagios $NOTIFICATIONTYPE$ Host: $HOSTNAME$ State: $HOSTSTATE$" | /path/to/gammu --sendsms TEXT $_CONTACTSMSNUMBER$ 
        } 
define command{ 
        command_name    notify-service-by-gammu 
        command_line    echo "Nagios $NOTIFICATIONTYPE$ Svc: $SERVICEDESC$ Host: $HOSTALIAS$ State: $SERVICESTATE$" | /path/to/gammu --sendsms TEXT $_CONTACTSMSNUMBER$ 
        } 
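For Gnokii, an equivalent pair of commands might look like the sketch below. It assumes that gnokii is already configured for the modem and relies on gnokii --sendsms reading the message text from standard input; the command names and the /path/to/gnokii location are placeholders of our own.

```
# Sketch only: assumes a working gnokii configuration for the attached modem
define command{
        command_name    notify-host-by-gnokii
        command_line    echo "Nagios $NOTIFICATIONTYPE$ Host: $HOSTNAME$ State: $HOSTSTATE$" | /path/to/gnokii --sendsms $_CONTACTSMSNUMBER$
        }
define command{
        command_name    notify-service-by-gnokii
        command_line    echo "Nagios $NOTIFICATIONTYPE$ Svc: $SERVICEDESC$ Host: $HOSTALIAS$ State: $SERVICESTATE$" | /path/to/gnokii --sendsms $_CONTACTSMSNUMBER$
        }
```

As with the Gammu commands, the recipient's number comes from the contact's _SMSNUMBER custom variable.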

Current mobile phones also offer cheap Internet connectivity, and smart devices offer the possibility to write custom applications in Java, .NET, and many other languages including Ruby, Python, and Tcl. Therefore, you can also make a client/server application that queries the server for the status of selected hosts and services. It can even be unified with a notification command that pushes the changes down to the application immediately.

Integrating with HipChat

There are also multiple specialized tools for communication within organizations, such as HipChat (http://www.hipchat.com/). It is a popular online service for group and direct communication within a company. The service has extensive APIs and is commonly used for sending notifications in addition to regular messaging.

HipChat offers rooms for group communications, which are often used for receiving notifications as well—such as a room for Nagios notifications, where IT staff reside and receive notifications instantly. The chat can then also be used to quickly and informally assign tasks to individual people.

There is a ready-to-use, freely available solution for integrating Nagios with HipChat called hipsaint, which is available from https://github.com/hannseman/hipsaint.

To use it, simply download the source code and run the installation script:

$ python setup.py install 

Next, create new commands to send notifications to specific rooms:

define command { 
    command_name    notify-host-by-hipchat 
    command_line    hipsaint --token=tokenid --room=roomid --type=host --inputs="$HOSTNAME$|$LONGDATETIME$|$NOTIFICATIONTYPE$|$HOSTADDRESS$|$HOSTSTATE$|$HOSTOUTPUT$" -n 
} 
 
define command { 
    command_name    notify-service-by-hipchat 
    command_line    hipsaint --token=tokenid --room=roomid --type=service --inputs="$SERVICEDESC$|$HOSTALIAS$|$LONGDATETIME$|$NOTIFICATIONTYPE$|$HOSTADDRESS$|$SERVICESTATE$|$SERVICEOUTPUT$" -n 
} 

All of the above are ways to send notifications about host and service statuses that are more convenient than regular e-mails. Letting the IT staff know when problems occur (and when they are resolved), and being able to communicate with other people in your team or company, is essential. E-mail may be a good solution in many cases; however, it is a good idea to spend some time researching a convenient and non-intrusive channel for Nagios notifications.

Aside from the examples mentioned above, there are many more ready-to-use solutions available online. Many of them are listed on Nagios Exchange at http://exchange.nagios.org/directory/Addons/Notifications.

Slack integration

Slack (https://slack.com) is a very popular communication tool. Aside from regular chat-based communication, it allows sending messages both to channels and directly to other users.

The easiest way to post a message to Slack from an external service such as Nagios is to use an incoming webhook (https://api.slack.com/incoming-webhooks). Essentially, it boils down to sending an HTTP POST with specific JSON content to a webhook URL, which is uniquely generated for your Slack team. Once we have Slack configured and a webhook URL generated, we can proceed with setting up commands to send notifications:

define command{ 
        command_name    notify-host-by-slack 
        command_line    /usr/bin/curl -X POST --data-urlencode 'payload={"channel": "#general", "username": "Nagios", "text": "Nagios Host Notification Type: $NOTIFICATIONTYPE$ Host: $HOSTNAME$ State: $HOSTSTATE$ Address: $HOSTADDRESS$ Info: $HOSTOUTPUT$"}' WEBHOOK_URL 
        } 
 
define command{ 
        command_name    notify-service-by-slack 
        command_line    /usr/bin/curl -X POST --data-urlencode 'payload={"channel": "#general", "username": "Nagios", "text": "Nagios Service Notification Type: $NOTIFICATIONTYPE$ Service: $SERVICEDESC$ Host: $HOSTALIAS$ Address: $HOSTADDRESS$ State: $SERVICESTATE$ Additional Info: $SERVICEOUTPUT$"}' WEBHOOK_URL 
        } 

As we can see, it is enough to have the curl command installed, and the good news is that, most probably, it is already present in the system. Replace the WEBHOOK_URL string with the correct value, similar to https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX and you are ready to go. A sample message sent from Nagios to Slack is shown in the following diagram:

[Figure: a sample Nagios notification message displayed in Slack]

The message's JSON payload is well described in Slack's documentation; here is a quick recap of the most important fields:

Option        Description
text          Message text
channel       Target of the message: either a channel (in the form #channelName) or a user (@userName)
username      String that will be displayed in Slack as the message sender
icon_url      URL of an icon that will be displayed along with the message
icon_emoji    Code of one of Slack's built-in emoji, for example, :warning:
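To illustrate how the optional fields can be combined, the following sketch sends a service notification directly to the contact and marks it with an emoji. The command name and the _SLACKTARGET custom variable are hypothetical, and whether a given webhook honors the channel and icon overrides depends on how it was created in Slack, so treat this as a starting point rather than a definitive setup.

```
# Sketch only: the contact is assumed to define a custom variable such as
#     _SLACKTARGET    @jdoe
# which Nagios exposes to commands as $_CONTACTSLACKTARGET$
define command{
        command_name    notify-service-by-slack-direct
        command_line    /usr/bin/curl -X POST --data-urlencode 'payload={"channel": "$_CONTACTSLACKTARGET$", "username": "Nagios", "icon_emoji": ":warning:", "text": "Nagios Service Notification Type: $NOTIFICATIONTYPE$ Service: $SERVICEDESC$ Host: $HOSTALIAS$ State: $SERVICESTATE$ Additional Info: $SERVICEOUTPUT$"}' WEBHOOK_URL
        }
```

As before, WEBHOOK_URL needs to be replaced with the URL generated for your team.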

Although the provided example of Nagios-Slack integration is rather simple and straightforward, it can be enhanced in many ways, for example, by specifying an icon that reflects the service state or by selecting the appropriate user or channel based on notification parameters. One interesting example is available at http://matthewcmcmillan.blogspot.com/2013/12/simple-way-to-integrate-nagios-with.html, where the author decided to put the message formatting logic into a separate shell script.

Please note that, at the time of writing, that script did not include full support for notifications about host and/or service acknowledgements: such notifications will still be sent properly, but may lack the comment and the name of the user who acknowledged the problem.
