In this recipe, we'll learn how to use the check_snmp plugin to monitor the values returned by SNMP (Simple Network Management Protocol) requests.
Despite its name, SNMP is not really a very simple protocol. However, it's a very common method for accessing information on many kinds of networked devices, including monitoring boards, usage meters, and storage appliances as well as workstations, servers, and routing equipment.
Because SNMP is so widely supported and typically able to produce such a large volume of information to trusted hosts, it's an excellent way of gathering information from hosts that's not otherwise retrievable from network services. For example, while checking for a PING response from a large router is simple enough, there may not be an easy way to check properties like the state of each of its interfaces, or the presence of a certain route in its routing tables.
Using check_snmp in Nagios Core allows this information to be retrieved from devices automatically, with alerts generated as appropriate. While its setup is somewhat complex, it is worth learning: it is among the most powerful plugins in Nagios Core for network administrators, and a configuration for a large network commonly defines dozens of commands that use it. It can often complement or even replace remote plugin execution daemons like NRPE or NSClient++.
You should have a Nagios Core 4.0 or newer server with at least one host configured already. You should also understand the basics of how hosts and services relate, which is covered in the recipes in Chapter 1, Understanding Hosts, Services, and Contacts.
This recipe assumes a basic knowledge of SNMP, including its general intended purpose, the concept of an SNMP community, and what SNMP MIBs and OIDs are. In particular, if you're looking to monitor some property of a networked device that's available to you via SNMP, you should know what the OID for that data is. This information is often available in the documentation for network devices, or can be deduced by running an appropriate snmpwalk command against the host to view the output for all its OIDs.
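As a rough sketch of that discovery step, the snippet below defines a small helper, list_oids (a hypothetical name, not part of Net-SNMP), that reduces snmpwalk output to just the OID column. The subtree .1.3.6.1.2.1.25.1 and the public community are assumptions borrowed from the example used later in this recipe; adjust both for your own device.

```shell
#!/bin/sh
# list_oids: hypothetical helper that reads snmpwalk-style output on stdin
# ("OID = TYPE: value" per line) and prints only the OID of each line.
list_oids() {
    awk -F ' = ' 'NF >= 2 { print $1 }'
}

# Example use (needs network access to the target, so it is left commented out):
# snmpwalk -v1 -c public ithaca.example.net .1.3.6.1.2.1.25.1 | list_oids
```

Piping the walk through a filter like this makes it easier to scan a large subtree for candidate OIDs before querying one of them individually.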
You should check that an SNMP daemon is running on the target host and that the check_snmp plugin is available on the monitoring host. The plugin is part of the standard Nagios Plugins, so it should be present provided the Net-SNMP libraries were available on the system when the plugins were compiled. If it is not, you may need to install the Net-SNMP libraries on your monitoring system and recompile the plugins.
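One possible way to confirm the plugin is installed is sketched below. The helper name plugin_available is hypothetical, and the path in the usage comment assumes the default /usr/local/nagios/libexec location used elsewhere in this recipe.

```shell
#!/bin/sh
# plugin_available: hypothetical helper that succeeds (exit status 0)
# only if the given path exists and is executable by the current user.
plugin_available() {
    [ -x "$1" ]
}

# Example use, assuming the default plugin directory:
# plugin_available /usr/local/nagios/libexec/check_snmp \
#     || echo "check_snmp not found; Net-SNMP may be missing"
```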
We'll use the example of retrieving the total process count from a Linux server with hostname ithaca.example.net and flagging WARNING and CRITICAL states at appropriate high ranges. We'll also discuss how to test for the presence or absence of strings rather than numeric thresholds.
It's a good idea to test that the host will respond to SNMP queries in the expected form. We can test this with snmpget. Assuming a community name of public, we could write:

    $ snmpget -v1 -c public ithaca.example.net .1.3.6.1.2.1.25.1.6.0
    iso.3.6.1.2.1.25.1.6.0 = Gauge32: 81
We can also test the plugin by running it directly as the nagios user:

    # sudo -s -u nagios
    $ /usr/local/nagios/libexec/check_snmp -H ithaca.example.net -C public -o .1.3.6.1.2.1.25.1.6.0
    SNMP OK - 81 | iso.3.6.1.2.1.25.1.6.0=81
We can define a command and service check for the Linux process count OID as follows:
1. Change to the objects configuration directory for Nagios Core. The default is /usr/local/nagios/etc/objects. If you've put the definition for your host in a different file, move to that directory instead.

    # cd /usr/local/nagios/etc/objects

2. Edit the commands.cfg file, and add the following definition to the end of the file.

    define command {
        command_name    check_snmp_linux_procs
        command_line    $USER1$/check_snmp -H $HOSTADDRESS$ -C $ARG1$ -o .1.3.6.1.2.1.25.1.6.0 -w 100 -c 200
    }

3. Edit the file containing the definition for the host. The host definition might look something like this:

    define host {
        use          linux-server
        host_name    ithaca.example.net
        alias        ithaca
        address      192.0.2.61
    }

4. Beneath the host definition, add a service definition that uses the new command. Replace public with the name of your SNMP community if it differs:

    define service {
        use                    generic-service
        host_name              ithaca.example.net
        service_description    SNMP_PROCS
        check_command          check_snmp_linux_procs!public
    }

5. Validate the configuration and reload the Nagios Core server:

    # /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
    # /etc/init.d/nagios reload
With this done, a new service check with a description of SNMP_PROCS will be added to the ithaca.example.net host, and the check_snmp plugin will request the value of the specified OID as its regular check. It will flag a WARNING state if the count is greater than 100 and a CRITICAL state if it is greater than 200, notifying accordingly. All this appears in the web interface the same way as any other service, under the Services menu item.
The preceding configuration defines both a new command based around the check_snmp plugin and, in turn, a new service check using that command for the ithaca.example.net server. The community name for the SNMP request, public, is passed into the command as an argument; everything else, including the OID to be requested, is fixed in the check_snmp_linux_procs command definition.
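To illustrate how the argument reaches the command, the sketch below splits a check_command value on the ! separator. The helper name split_check_command is hypothetical and only models the splitting; Nagios Core performs the real macro substitution internally. The first line printed is the command name, and each following line corresponds to $ARG1$, $ARG2$, and so on.

```shell
#!/bin/sh
# split_check_command: hypothetical helper that prints the command name and
# each !-separated argument of a check_command value on its own line.
split_check_command() {
    printf '%s\n' "$1" | tr '!' '\n'
}

# Example use:
# split_check_command 'check_snmp_linux_procs!public'
# prints the command name on line 1 and the $ARG1$ value on line 2
```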
Part of the command line defined includes the -w and -c options. For numeric outputs like ours, these are used to define the limits for the value beyond which a WARNING or CRITICAL state is raised, respectively. In this case, we define a WARNING threshold of 100 processes and a CRITICAL threshold of 200 processes.
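The threshold behavior described here can be modeled with a few lines of shell. This is a simplified sketch that assumes plain integer thresholds, as in our command definition; the plugin itself accepts more elaborate range syntax, which this model ignores. The function name classify_count is hypothetical.

```shell
#!/bin/sh
# classify_count: hypothetical model of simple -w/-c integer thresholds.
# $1 = measured value, $2 = warning threshold, $3 = critical threshold.
classify_count() {
    cc_value=$1
    cc_warn=$2
    cc_crit=$3
    if [ "$cc_value" -gt "$cc_crit" ]; then
        echo CRITICAL      # strictly above the critical threshold
    elif [ "$cc_value" -gt "$cc_warn" ]; then
        echo WARNING       # above warning but not above critical
    else
        echo OK            # at or below the warning threshold
    fi
}

# Example use with the thresholds from this recipe:
# classify_count 81 100 200
```

With the thresholds of 100 and 200 used in this recipe, the value of 81 seen earlier stays in the OK state, while a value of exactly 100 is still OK because the comparison is strict.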
If the SNMP check fails completely due to connectivity problems or syntax errors, an UNKNOWN state will be reported instead.
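When testing the plugin by hand, the state is conveyed by its exit status, following the usual Nagios plugin convention of 0, 1, 2, and 3 for OK, WARNING, CRITICAL, and UNKNOWN. The helper below, with the hypothetical name state_name, is a minimal sketch that turns such a status into its label.

```shell
#!/bin/sh
# state_name: hypothetical helper mapping a plugin exit status to the
# state name used by the standard Nagios plugin convention.
state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        3) echo UNKNOWN ;;
        *) echo INVALID ;;   # anything else is outside the convention
    esac
}

# Example use after running the plugin manually (commented out: needs the host):
# /usr/local/nagios/libexec/check_snmp -H ithaca.example.net -C public -o .1.3.6.1.2.1.25.1.6.0
# state_name $?
```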
It's also possible to test whether the output of an SNMP check matches a particular string or pattern, and to use that to determine whether the check succeeded. If we needed to check that the system's short hostname was ithaca, for example (perhaps as a simple test SNMP query that should always succeed), we might set up a command definition as follows:
    define command {
        command_name    check_snmp_hostname
        command_line    $USER1$/check_snmp -H $HOSTADDRESS$ -C $ARG1$ -o .1.3.6.1.2.1.1.5.0 -r $ARG2$
    }
With a corresponding service check like this:
    define service {
        use                    generic-service
        host_name              ithaca.example.net
        service_description    SNMP_HOSTNAME
        check_command          check_snmp_hostname!public!ithaca
    }
This particular check would only succeed if the SNMP query succeeds and returns a string matching the pattern ithaca, as specified in the second argument.
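Because the -r option takes an extended regular expression, an unanchored pattern such as ithaca would also match a longer value like ithaca2. The sketch below models that behavior with grep -E; the function name match_state is hypothetical, and the mapping of a failed match to CRITICAL is an assumption about the plugin's default behavior rather than something stated in this recipe.

```shell
#!/bin/sh
# match_state: hypothetical model of a pattern-based check.
# $1 = value returned by the SNMP query, $2 = extended regular expression.
match_state() {
    if printf '%s\n' "$1" | grep -Eq -- "$2"; then
        echo OK          # the pattern matched somewhere in the value
    else
        echo CRITICAL    # assumed state when the pattern does not match
    fi
}

# Example use:
# match_state 'ithaca2' 'ithaca'       # unanchored pattern still matches
# match_state 'ithaca2' '^ithaca$'     # anchored pattern does not
```

If only an exact value should pass, anchoring the pattern as shown in the second example is one option when testing by hand.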