In this recipe, we'll learn how to set up PING monitoring for a host. We'll use the check_ping plugin, and its command of the same name, to send ICMP ECHO requests to a host. We'll use this as a simple diagnostic check to make sure that the host's network stack is responding in a consistent and timely fashion, in much the same way an administrator might use the ping(8) command interactively to check the same properties.
You should have a Nagios Core 4.0 or newer server with at least one host configured already. We'll use the example of corinth.example.net, a host defined in its own file. You should also understand the basics of how hosts and services relate, which is covered in the recipes in Chapter 1, Understanding Hosts, Services, and Contacts.
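We can add a new PING service check to our existing host as follows:

1. Change to the objects configuration directory for Nagios Core. The default is /usr/local/nagios/etc/objects. If you've put the definition for your host in a different file, move to that directory instead.

    # cd /usr/local/nagios/etc/objects

2. Edit the file containing the definition for the host. The host definition may look something like this:

    define host {
        use        windows-server
        host_name  corinth.example.net
        alias      corinth
        address    192.0.2.27
    }

3. Beneath the definition for the host, place a service definition referring to check_ping. It's recommended that you use the generic-service template, as follows:

    define service {
        use                  generic-service
        host_name            corinth.example.net
        service_description  PING
        check_command        check_ping!100,20%!200,40%
    }

4. Validate the configuration and reload the Nagios Core server:

    # /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
    # /etc/init.d/nagios reload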
With this done, a new service check will start taking place, with the appropriate contacts and contact groups notified in the following scenarios:

- If the RTT of the ICMP ECHO request and its response exceeds 200 ms, or more than 40% of the packets are lost during the check, a CRITICAL notification is fired for the service.
- If a CRITICAL notification was not fired, and the RTT of the request and its response exceeds 100 ms, or more than 20% of the packets are lost during the check, a WARNING notification is fired for the service.

The thresholds are given in the check_command directive of the service definition, as arguments to the check_ping command.
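To see how those arguments reach the plugin, it can help to look at the command definition itself. The following is a sketch of how check_ping is typically defined in the sample commands.cfg shipped with Nagios Core; this is an assumption about your installation, so check your own file before relying on it. The first argument after the ! separator becomes $ARG1$ (the WARNING thresholds) and the second becomes $ARG2$ (the CRITICAL thresholds).

```
# Assumed to match the sample commands.cfg; your installed copy may differ.
define command {
    command_name  check_ping
    ; -w takes WARNING thresholds, -c takes CRITICAL thresholds,
    ; each in the form <round trip time in ms>,<packet loss>%
    ; -p sets the number of ICMP ECHO packets sent per check
    command_line  $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
}
```

With check_ping!100,20%!200,40%, this would expand to -w 100,20% -c 200,40% for the host's address.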
Information about this service will also be visible in the web interface under the Services section.
The preceding configuration defines a new service check on the existing corinth.example.net host to check that the RTT and the packet loss for an ICMP ECHO request and response are within acceptable limits.
For most network configurations, it may well be the case that the host itself is also being checked by check_ping, by way of the check-host-alive command. The difference is that the thresholds for RTT and packet loss are intentionally set very high for this command, because it is intended to establish whether the host is up or down at all, not how responsive it is.
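For comparison, the sample commands.cfg shipped with Nagios Core typically defines check-host-alive along the following lines. Treat this as an illustration rather than a guarantee of what is installed on your system, since the values may have been changed locally.

```
# Assumed to match the sample commands.cfg; your installed copy may differ.
define command {
    command_name  check-host-alive
    ; Thresholds are hardcoded and deliberately loose:
    ; WARNING at 3000 ms or 80% loss, CRITICAL at 5000 ms or 100% loss
    command_line  $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
}
```

Because the thresholds are written into the command line rather than passed as $ARG1$ and $ARG2$, this command takes no arguments when it is used as a host's check_command.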
In networks where all the hosts are configured to respond to ICMP ECHO requests, it could be worthwhile to configure the service check on all of them with a single definition. This can be done using the * wildcard when defining the host_name for the service:

    define service {
        use                  generic-service
        host_name            *
        service_description  PING
        check_command        check_ping!100,20%!200,40%
    }

This will apply the same check, with a service_description of PING, to all the hosts configured in the database. This saves the hassle of configuring the service separately for each host.
If some of the hosts in a network do not respond to PING, it may be more appropriate to place the ones that do in a hostgroup, perhaps named something like icmp:

    define hostgroup {
        hostgroup_name  icmp
        alias           ICMP enabled hosts
        members         sparta.example.net,corinth.example.net
    }

The single service can then be applied to all the hosts in that group using the hostgroup_name directive in the service definition:

    define service {
        use                  generic-service
        hostgroup_name       icmp
        service_description  PING
        check_command        check_ping!100,20%!200,40%
    }
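As an alternative to listing every host in the hostgroup's members directive, membership can also be declared from the host side with the hostgroups directive. The following sketch reuses the corinth.example.net host from this recipe; whether this reads better than a central members list is a matter of preference.

```
define host {
    use         windows-server
    host_name   corinth.example.net
    alias       corinth
    address     192.0.2.27
    ; Joins the icmp hostgroup, so the PING service defined with
    ; hostgroup_name icmp applies to this host as well
    hostgroups  icmp
}
```

This keeps the decision about whether a host should be pinged next to the host's own definition, which can be easier to maintain when hosts live in separate files.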
It's generally a good idea to have network hosts respond to ICMP messages where possible, both to comply with the recommendations in RFC 1122 and to ease debugging.
Finally, note that the thresholds for the round trip time and packet loss are not fixed; in fact, they're defined in the service definition, in the check_command line. For hosts that have higher latency, perhaps due to network load or topology, it may be appropriate to adjust these thresholds; this is covered in the recipe Changing thresholds for ping RTT and packet loss in Chapter 3, Working with Checks and States.
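As a brief illustration of such an adjustment, the sketch below loosens the thresholds for a single host. The values 250,30% and 500,60% are hypothetical and chosen only for the example, and it assumes sparta.example.net does not already receive a PING service from a wildcard or hostgroup definition.

```
define service {
    use                  generic-service
    host_name            sparta.example.net
    service_description  PING
    ; Hypothetical thresholds for a higher-latency link:
    ; WARNING at 250 ms or 30% loss, CRITICAL at 500 ms or 60% loss
    check_command        check_ping!250,30%!500,60%
}
```

Sensible values depend on the latency the link normally shows, so it is worth observing the host for a while before settling on them.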