Monitoring PING for any host

In this recipe, we'll learn how to set up PING monitoring for a host. We'll use the check_ping plugin, and its command of the same name, to send ICMP ECHO requests to a host. We'll use this as a simple diagnostic check to make sure that the host's network stack is responding in a consistent and timely fashion, in much the same way an administrator might use the ping(8) command interactively to check the same properties.

Getting ready

You should have a Nagios Core 4.0 or newer server with at least one host configured already. We'll use the example of corinth.example.net, a host defined in its own file. You should also understand the basics of how hosts and services relate, which is covered in the recipes in Chapter 1, Understanding Hosts, Services, and Contacts.

How to do it...

We can add a new PING service check to our existing host as follows:

  1. Change to the objects configuration directory for Nagios Core. The default is /usr/local/nagios/etc/objects. If the file defining your host lives in a different directory, change to that directory instead.
    # cd /usr/local/nagios/etc/objects
    
  2. Edit the file containing the definition for the host. The host definition might look something like this:
    define host {
        use        windows-server
        host_name  corinth.example.net
        alias      corinth
        address    192.0.2.27
    }
    
  3. Beneath the definition for the host, place a service definition referring to check_ping. It's recommended that you use the generic-service template, as follows:
    define service {
        use                  generic-service
        host_name            corinth.example.net
        service_description  PING
        check_command        check_ping!100,20%!200,40%
    }
    
  4. Validate the configuration and reload the Nagios Core server:
    # /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
    # /etc/init.d/nagios reload
    

With this done, a new service check will start taking place, with the appropriate contacts and contact groups notified in the following scenarios:

  • CRITICAL: The average RTT (Round Trip Time) of the requests and their responses exceeds 200 ms, or more than 40% of the packets are lost during the check.
  • WARNING: The CRITICAL thresholds were not crossed, but the average RTT exceeds 100 ms, or more than 20% of the packets are lost during the check.

These thresholds are passed as arguments to the check_ping command in the check_command directive: the first argument is the WARNING pair and the second is the CRITICAL pair.
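
For reference, the sample commands.cfg shipped with Nagios Core typically defines the check_ping command along the following lines. Treat this as a sketch of the stock definition rather than a statement of what is installed on your server; check your own commands.cfg, since packagers and administrators sometimes change it:

# Assumed stock definition; $ARG1$ and $ARG2$ are filled from check_command
define command {
    command_name  check_ping
    command_line  $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
}

With a definition like this, the 100,20% in our service becomes the -w (WARNING) argument and 200,40% becomes the -c (CRITICAL) argument, each in the form <RTT in ms>,<packet loss>%. The -p 5 option sends five ECHO requests per check.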

Information about this service will also be visible in the web interface under the Services section.

How it works...

The preceding configuration defines a new service check on the existing corinth.example.net host to check that the RTT and the packet loss for an ICMP ECHO request and response are within acceptable limits.

In most network configurations, the host itself is probably also being checked by check_ping, by way of the check-host-alive command. The difference is that the thresholds for RTT and packet loss are intentionally set very high for that command, because it is intended to determine whether the host is up or down at all, not how responsive it is.
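
To make the contrast concrete, the check-host-alive command in the sample commands.cfg usually looks something like the following. This is the definition as commonly shipped, shown here as an assumption about your installation; the exact thresholds on your server may differ:

# Assumed stock definition; thresholds are deliberately lenient
define command {
    command_name  check-host-alive
    command_line  $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
}

Here the check only goes CRITICAL when the average RTT reaches several seconds or every packet is lost, which is why a separate PING service with tighter thresholds is useful for judging responsiveness.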

There's more...

In networks where all the hosts are configured to respond to ICMP ECHO requests, it may be worthwhile to apply the same service check to every host. This can be done using the * wildcard when defining the host_name for the service:

define service {
    use                  generic-service
    host_name            *
    service_description  PING
    check_command        check_ping!100,20%!200,40%
}

This will apply the same check, with a service_description of PING, to all the hosts defined in the configuration. This saves the hassle of configuring the service separately for each host.

If some of the hosts in a network do not respond to PING, it may be more appropriate to place the ones that do in a hostgroup, perhaps named something like icmp:

define hostgroup {
    hostgroup_name  icmp
    alias           ICMP enabled hosts
    members         sparta.example.net,corinth.example.net
}

The single service can then be applied to all the hosts in that group using the hostgroup_name directive in the service definition:

define service {
    use                  generic-service
    hostgroup_name       icmp
    service_description  PING
    check_command        check_ping!100,20%!200,40%
}

It's generally a good idea to have network hosts respond to ICMP messages where possible, both to comply with the recommendations in RFC 1122 and to ease debugging.

Finally, note that the thresholds for the round trip time and packet loss are not fixed; in fact, they're defined in the service definition, in the check_command line. For hosts that have higher latency, perhaps due to network load or topology, it may be appropriate to adjust these thresholds; this is covered in the recipe Changing thresholds for ping RTT and packet loss in Chapter 3, Working with Checks and States.
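
As a minimal sketch of such an adjustment, a service for a higher-latency host might loosen the RTT thresholds while keeping the packet loss thresholds unchanged. The host name thebes.example.net and the 300 ms and 600 ms values are hypothetical, chosen only for illustration; suitable values depend on the latency you actually observe:

# Hypothetical high-latency host; threshold values are illustrative only
define service {
    use                  generic-service
    host_name            thebes.example.net
    service_description  PING
    check_command        check_ping!300,20%!600,40%
}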

See also

  • Creating a new host, Chapter 1, Understanding Hosts, Services, and Contacts
  • Creating a new service, Chapter 1, Understanding Hosts, Services, and Contacts
  • Creating a new hostgroup, Chapter 1, Understanding Hosts, Services, and Contacts
  • Changing thresholds for ping RTT and packet loss, Chapter 3, Working with Checks and States
  • Using an alternative check command, Chapter 2, Working with Commands and Plugins