Establishing a service dependency

In this recipe, you'll learn how to establish a service dependency between two services. This feature can be used to control how Nagios Core checks services and notifies about problems in situations where one service being in a PROBLEM state implies that at least one other service is necessarily also in a PROBLEM state.

Getting ready

You will need a server running Nagios Core 4.0 or newer, and shell access to change its backend configuration. You will also need to have at least two services defined, one of which is by definition dependent on the other; this means that if the depended-upon service were to enter the CRITICAL state, the dependent service would necessarily be CRITICAL as well.

We'll use a simple example: let's suppose that we are testing authentication to a mailserver marathon.example.net with a service MAIL_LOGIN and also checking a database service MAIL_DB on the same host that stores the login usernames and password hashes.

In this situation, if MAIL_DB is not working, then MAIL_LOGIN will almost certainly not be working either. We can therefore configure Nagios Core to be aware that the MAIL_LOGIN service is dependent on the MAIL_DB service.

How to do it…

We can establish our service dependency like so:

  1. Change to the objects configuration directory for Nagios Core. The default is /usr/local/nagios/etc/objects. If you keep your object definitions in a different directory, change to that directory instead:
    # cd /usr/local/nagios/etc/objects
    
  2. Create or edit an appropriate file that will be included by the configuration in /usr/local/nagios/etc/nagios.cfg. A sensible choice could be /usr/local/nagios/etc/objects/dependencies.cfg:
    # vi dependencies.cfg
    
  3. Add a servicedependency definition. In our case, the definition looks similar to this:
    define servicedependency {
        host_name                      marathon.example.net
        service_description            MAIL_DB
        dependent_host_name            marathon.example.net
        dependent_service_description  MAIL_LOGIN
        execution_failure_criteria     c
        notification_failure_criteria  c
    }
    
  4. Edit nagios.cfg to include a reference to this new file so that it's included in the configuration, as follows:
    cfg_file=/usr/local/nagios/etc/objects/dependencies.cfg
    
  5. Validate the configuration and restart the Nagios Core server by executing the following:
    # /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
    # /etc/init.d/nagios restart
    

With this done, if the MAIL_DB service fails for whatever reason and enters the CRITICAL state, Nagios Core will skip its checks of the MAIL_LOGIN service and suppress any notifications that it would normally send about that service's problems. Note that the web interface may still show checks of MAIL_LOGIN being scheduled, but Nagios Core won't actually run them.

How it works…

The service dependency object's six directives are as follows:

  • host_name: This is the name of the host on which the service being depended upon runs.
  • service_description: This is the description of the service being depended upon. This can be a comma-separated list.
  • dependent_host_name: This is the name of the host on which the dependent service runs. In this example, it is the same host as host_name.
  • dependent_service_description: This is the description of the dependent service. This can also be a comma-separated list.
  • execution_failure_criteria: This defines a list of states for the service being depended upon. If this service is in any of these states, then Nagios Core will skip checks for the dependent services. This can be a comma-separated list of any of the following flags:
    • o: The depended-upon service is OK
    • w: The depended-upon service is WARNING
    • c: The depended-upon service is CRITICAL (as in this example)
    • u: The depended-upon service is UNKNOWN
    • p: The depended-upon service is PENDING (that is, not checked yet)

    Alternatively, the single n flag can be used to specify that the checks should take place regardless of the depended-upon service's state. In this example, we chose the c value, to suppress service checks only if the service being depended upon is CRITICAL.

  • notification_failure_criteria: This defines a list of states for the service being depended upon. If this service is in any of these states, then notifications for the dependent service will not be sent. The flags are the same as for execution_failure_criteria; in this example, we again chose the c value to suppress the notifications only if the service being depended upon is CRITICAL.
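
The two criteria directives do not have to match. As a minimal sketch that uses only the flags listed above, the following variant of our definition keeps checking MAIL_LOGIN regardless of the state of MAIL_DB, but suppresses its notifications while MAIL_DB is WARNING, CRITICAL, or UNKNOWN. The choice of flags here is an illustrative assumption, not a recommendation:

```
define servicedependency {
    host_name                      marathon.example.net
    service_description            MAIL_DB
    dependent_host_name            marathon.example.net
    dependent_service_description  MAIL_LOGIN
    # n: never skip checks of MAIL_LOGIN because of MAIL_DB's state
    execution_failure_criteria     n
    # w,c,u: stay quiet while MAIL_DB is WARNING, CRITICAL, or UNKNOWN
    notification_failure_criteria  w,c,u
}
```

A combination like this can be useful if you still want the status of the dependent service recorded during an outage of the service it depends on, without receiving a second alert for the same root cause.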

When Nagios Core notices that the MAIL_DB service has gone into the CRITICAL state as a result of a failed service check, it refers to its configuration to check whether there are any dependencies for the service and finds that MAIL_LOGIN depends on it.

It then checks the status of MAIL_DB and finds it to be CRITICAL. Referring to execution_failure_criteria and finding c, it skips checks of the dependent service. Referring to notification_failure_criteria and also finding c, it decides that notifications for the dependent service should be suppressed until MAIL_DB returns to any other state.

There's more…

Note that services do not have to be on the same host to depend upon one another. We can set dependent_host_name to a different host, or use dependent_hostgroup_name, to make services on other hosts dependent, as follows:

define servicedependency {
    host_name                      marathon.example.net
    service_description            MAIL_DB
    dependent_host_name            sparta.example.net
    dependent_service_description  WEBMAIL_LOGIN
    execution_failure_criteria     c
    notification_failure_criteria  c
}

In this example, the WEBMAIL_LOGIN service on sparta.example.net is defined as dependent on the MAIL_DB service on marathon.example.net. Note that the values for host_name and dependent_host_name are different.
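
The dependent_hostgroup_name directive works along the same lines. The following sketch assumes a hypothetical hostgroup named webmail-servers, on each of whose member hosts a WEBMAIL_LOGIN service is defined; neither that hostgroup nor its members appear elsewhere in this recipe:

```
define servicedependency {
    host_name                      marathon.example.net
    service_description            MAIL_DB
    # webmail-servers is a hypothetical hostgroup used for illustration
    dependent_hostgroup_name       webmail-servers
    dependent_service_description  WEBMAIL_LOGIN
    execution_failure_criteria     c
    notification_failure_criteria  c
}
```

Under those assumptions, a single definition makes the WEBMAIL_LOGIN service on every host in the hostgroup dependent on MAIL_DB on marathon.example.net, instead of requiring one definition per host.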

See also

  • The Establishing a host dependency recipe in this chapter
  • The Monitoring individual nodes as a cluster recipe in this chapter