Filtering notifications based on a host or service value

One of the new features of Nagios Core 4.0 is the ability to assign a numeric value to hosts and services and use it as a threshold for notifications. This is implemented with the importance directive for hosts and services and the minimum_importance directive for contacts. The feature gives you some flexibility in deciding how critical problems with specific hosts or services are and, hence, which contacts to alert depending on their severity.

For example, it might be appropriate to page a system administrator about a server used for internal development being inaccessible, but not to contact anybody else, as this service being down doesn't involve accountability to the public, or a great deal of lost revenue. For such a service, we would set a low value for the importance directive. However, for problems with a service such as a public-facing website for a government organization, it may be appropriate to notify both system administrators and management about problems and so we would set a high value for the importance directive.

For this example, we'll use two web server hosts, one serving a development website and the other a public website. We'll then apply the importance and minimum_importance directives to filter notifications appropriately to the system administration and management teams.

Getting ready

You should have a Nagios Core 4.0.3 or newer server, with at least one host or service configured already. You should understand how notifications are generated and their default behavior in being sent to the contacts and contact_groups for hosts or services.

We'll use the example of a host called sparta.example.net, hosting a relatively unimportant development website, and one called athens.example.net, hosting a critical public-facing website. Both hosts are configured to send notifications to adam, a systems administrator, and brenden, a manager.

How to do it...

We can apply an importance for our hosts and a minimum_importance for our contacts as follows:

  1. Change to the objects configuration directory for Nagios Core. The default is /usr/local/nagios/etc/objects. If you've put the definitions for your hosts in a different directory, move to that directory instead.
    # cd /usr/local/nagios/etc/objects
    
  2. Edit the file or files containing your host definitions. For each relevant host, set the importance directive to some suitable value. Ours might look like the following; we set a much higher importance for the more critical host:
    define host {
        use                  linux-server
        host_name            sparta.example.net
        alias                sparta (development webserver)
        address              192.0.2.21
        notification_period  24x7
        contacts             adam,brenden
        importance           15
    }
    define host {
        use                  linux-server
        host_name            athens.example.net
        alias                athens (production webserver)
        address              192.0.2.22
        notification_period  24x7
        contacts             adam,brenden
        importance           100
    }
  3. Edit the file or files containing your contact objects, and set the minimum_importance for each to specify how urgent a notification needs to be before it is sent to that contact:
    define contact {
        use                generic-contact
        contact_name       adam
        alias              Adam (Sysadmin)
        email              [email protected]
        minimum_importance 10
    }
    define contact {
        use                generic-contact
        contact_name       brenden
        alias              Brenden (Manager)
        email              [email protected]
        minimum_importance 50
    }
  4. Validate the configuration and reload the Nagios Core server:
    # /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
    # /etc/init.d/nagios reload
    

With this done, if the sparta.example.net server in this example were to go DOWN or become UNREACHABLE, only adam would receive notifications, even though brenden is listed as a contact. However, if athens.example.net goes down, both adam and brenden receive notifications.

How it works...

The importance for sparta.example.net was set at 15. When Nagios Core notices that it's DOWN, after completing all its retries, it triggers a notification. Looking at the contacts directive, it finds that adam and brenden are both set as contacts for the host. It notes that the importance for the host is greater than zero and compares it to the minimum_importance for each contact. Because 15 is greater than adam's minimum_importance value of 10, a notification is sent to him, but, because it's less than brenden's minimum_importance of 50, no notification is sent to brenden.

However, the importance value of athens.example.net is much higher, at 100. When it goes DOWN, notifications are sent to both adam and brenden, because the importance of 100 is greater than both of their minimum_importance settings.

This method only works if both the host's importance value and the contact's minimum_importance value are set to values greater than zero. Both default to zero.
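
The comparison described above can be modeled in a few lines of Python. This is only a sketch of the behavior as this recipe describes it, not code taken from Nagios Core; the function names should_notify and recipients are hypothetical, and treating an importance exactly equal to the threshold as passing is an assumption, since the example only covers values above and below it.

```python
def should_notify(importance, minimum_importance):
    """Model of the filter described in this recipe (not Nagios Core source)."""
    # Per the recipe, the filter only applies when both values are above
    # the default of zero; otherwise the contact is notified as usual.
    if importance <= 0 or minimum_importance <= 0:
        return True
    # Assumption: an importance equal to the threshold also notifies.
    return importance >= minimum_importance


# Values taken from the host and contact definitions in this recipe.
hosts = {"sparta.example.net": 15, "athens.example.net": 100}
contacts = {"adam": 10, "brenden": 50}


def recipients(host_name):
    """Return the contacts that would be notified about the given host."""
    return [
        name
        for name, minimum in contacts.items()
        if should_notify(hosts[host_name], minimum)
    ]


print(recipients("sparta.example.net"))  # ['adam']
print(recipients("athens.example.net"))  # ['adam', 'brenden']
```

With the values from this recipe, the model gives the same result as the walkthrough: only adam hears about sparta.example.net, and both contacts hear about athens.example.net.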

There's more...

In practice, for this simple example, it's likely that we'd instead simply specify only adam as the contact for sparta.example.net, which would have the same effect as the configuration in this recipe. However, for large and complex setups with a lot of hosts, services, and contacts, using a numeric scale for gauging whether notifications should be sent to contacts may be more flexible than managing associated contacts per host or service. This is especially likely to be the case if your Nagios Core configuration is large and generated from some other tool, such as a configuration management database.

Adding to this flexibility, if you define importance for a host's services as well as for the host itself, then the importance for the host is calculated as the sum of the host's own value and all of its service values.
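
As a rough illustration of that sum, here is a small Python sketch. The function name effective_host_importance is hypothetical, and the two service values of 20 are invented for the example; they don't appear in the definitions in this recipe.

```python
def effective_host_importance(host_importance, service_importances):
    """Host value plus all of its service values, as described in the text."""
    return host_importance + sum(service_importances)


# Hypothetical: sparta's host importance of 15 plus two services
# that we imagine have been given an importance of 20 each.
total = effective_host_importance(15, [20, 20])
print(total)  # 55
```

Under these assumed values, the combined figure of 55 is above brenden's minimum_importance of 50, whereas the host's own value of 15 is not.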

From Nagios Core 4.0.0 through 4.0.2, the directives were named hourly_value and minimum_value instead of importance and minimum_importance; they were renamed in Nagios Core 4.0.3 and the old names are now deprecated. If you're using Nagios Core 4.0.3 or newer, you should use the new names to avoid warnings.
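
If you have older object files that still use the deprecated names, a short script can help with the rename. The following Python sketch is one possible approach rather than an official migration tool; the function name rename_directives is hypothetical, and it assumes each directive appears at the start of a line, preceded only by spaces or tabs, as in the definitions in this recipe.

```python
import re

# Old directive names and their replacements, as described in the text.
RENAMED = {
    "hourly_value": "importance",
    "minimum_value": "minimum_importance",
}

# Match an old directive name at the start of a line, allowing indentation.
_PATTERN = re.compile(
    r"^([ \t]*)(hourly_value|minimum_value)([ \t]+)", re.MULTILINE
)


def rename_directives(config_text):
    """Return config_text with the deprecated directive names replaced."""
    return _PATTERN.sub(
        lambda m: m.group(1) + RENAMED[m.group(2)] + m.group(3),
        config_text,
    )


old_config = (
    "define host {\n"
    "    host_name     sparta.example.net\n"
    "    hourly_value  15\n"
    "}\n"
)
print(rename_directives(old_config))
```

The sketch leaves the rest of each line untouched, so column alignment may shift slightly. Validate the configuration after any such change, as in step 4.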

See also

  • The Choosing states for notification section in this chapter
  • The Defining an escalation for repeated notifications section in this chapter