In this recipe, we'll learn how to check that the SSH daemon on a remote host is responding to requests using the check_ssh
plugin and the command of the same name. This will allow us to be notified as soon as there are problems connecting to the SSH service.
You should have a Nagios Core 4.0 or newer server with at least one host configured already. We'll use the example of troy.example.net, a host defined in its own file. You should also understand the basics of how hosts and services relate, which is covered in the recipes in Chapter 1, Understanding Hosts, Services, and Contacts.
It may be a good idea to verify first that the host for which you want to add monitoring is presently running the SSH service that requires checking. This can be done by running the ssh(1)
client to make a connection to the host:
$ ssh troy.example.net
We should also check that the plugin itself will return the result required when run against the applicable host as the nagios
user:
# sudo -s -u nagios
$ /usr/local/nagios/libexec/check_ssh troy.example.net
If you're unable to get a positive response from the SSH service on the target machine even if you're sure it's running, this could perhaps be a symptom of unrelated connectivity or filtering problems. We may, for example, need to add the monitoring server on which Nagios Core is running to the whitelist for SSH (normally TCP destination port 22) on any applicable firewalls or routers.
We can add a new SSH service check to our existing host as follows:
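1. Change to the objects configuration directory for Nagios Core. The default is /usr/local/nagios/etc/objects. If you've put the definition for your host in a different file, move to that directory instead.
# cd /usr/local/nagios/etc/objects
2. Edit the file containing the definition for the host. The definition may look similar to the following:
define host {
    use        linux-server
    host_name  troy.example.net
    alias      troy
    address    192.0.2.25
}
3. Beneath the host definition, add a service definition that references check_ssh. It may help to use the generic-service template or another suitable template, as follows:
define service {
    use                  generic-service
    host_name            troy.example.net
    service_description  SSH
    check_command        check_ssh
}
4. Validate the configuration and reload the Nagios Core server:
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
# /etc/init.d/nagios reload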
With this done, a new service check will start taking place with the appropriate contacts and contact groups being notified when an attempt to connect to the SSH server fails. The service check will be visible in the web interface on the Services page.
The preceding configuration defines a new service with a service_description of SSH for the existing troy.example.net host. It uses the values in the generic-service template and additionally defines a check_command of check_ssh.
This means that in addition to checking whether the host itself is up with check-host-alive, as it did previously, Nagios Core will also check that the SSH service running on the host is working by attempting to make a connection to it. If it finds problems with the service after the appropriate number of tests, it notifies the applicable contacts.
For example, if the plugin finds that the host is accessible but not responding to client tests, it might notify with the following text:
Subject: ** PROBLEM Service Alert: troy.example.net/SSH is CRITICAL **
***** Nagios *****
Notification Type: PROBLEM
Service: SSH
Host: troy.example.net
Address: troy.example.net
State: CRITICAL
Date/Time: Mon Aug 27 21:15:12 NZST 2015
Additional Info:
CRITICAL - Socket timeout after 10 seconds
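How many failed tests it takes before a notification like this one is sent depends on the service's scheduling directives, which in this recipe are inherited from the generic-service template. As a rough sketch, a variant of the recipe's service definition could set them explicitly. The three values below are illustrative assumptions, not the template's defaults, and the intervals are in minutes only if interval_length is left at its default of 60 seconds:

```cfg
# Variant of the recipe's SSH service with explicit scheduling (values are assumptions)
define service {
    use                  generic-service
    host_name            troy.example.net
    service_description  SSH
    check_command        check_ssh
    check_interval       5    ; time between checks while the service is OK
    retry_interval       1    ; time between rechecks after a failed check
    max_check_attempts   3    ; consecutive failures before contacts are notified
}
```

This would replace the earlier definition rather than sit alongside it, since a host cannot have two services with the same service_description.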
Note that we don't need to actually supply credentials for the SSH check; the plugin simply ensures that the service is running and responding to connection attempts.
The definition for the check_ssh command warrants some inspection if we're curious as to how the plugin is actually applied as a command, as defined (by default) in /usr/local/nagios/etc/objects/commands.cfg:
define command {
    command_name  check_ssh
    command_line  $USER1$/check_ssh $ARG1$ $HOSTADDRESS$
}
This shows that the check_ssh command is configured to run the check_ssh binary file in $USER1$, a macro that normally expands to /usr/local/nagios/libexec, against the host address of the applicable server. It adds in any other arguments beforehand by expanding $ARG1$. We haven't used that macro in this recipe since we simply want to make a normal check of the SSH service on its default port.
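To illustrate what $ARG1$ is for, here is a minimal sketch of a service that passes an argument through it. It assumes a host whose SSH daemon listens on port 2222 (a hypothetical port chosen for this example) and uses the plugin's -p option. Everything after the "!" in check_command is handed to the command as $ARG1$:

```cfg
# Hypothetical service: SSH on a non-default port (2222 is an assumption)
define service {
    use                  generic-service
    host_name            troy.example.net
    service_description  SSH_2222
    check_command        check_ssh!-p 2222
}
```

With the command definition shown earlier, this should expand to roughly /usr/local/nagios/libexec/check_ssh -p 2222 192.0.2.25 when the check runs.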
This check should work with most SSH2-compliant servers, most notably the popular OpenSSH server.
Checking SSH accessibility is a common enough thing for servers that it may well be worth setting up an SSH check service to apply for a hostgroup, rather than merely for an individual host. For example, if you had a group called ssh-servers containing several servers that should be checked with a check_ssh call, you could configure them all to be checked with one service definition using the hostgroup_name directive:
define service {
    use                  generic-service
    hostgroup_name       ssh-servers
    service_description  SSH
    check_command        check_ssh
}
This would apply the same service check to each host in the group; this makes the definition easier to update if the check needs to be changed or removed in the future.
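For completeness, the ssh-servers group itself could be defined along these lines. This is only a sketch: the alias is arbitrary, and sparta.example.net is a hypothetical second host that would need its own host definition elsewhere in the configuration:

```cfg
# Hypothetical hostgroup; every member must already exist as a defined host
define hostgroup {
    hostgroup_name  ssh-servers
    alias           SSH servers
    members         troy.example.net,sparta.example.net
}
```

Adding a host to the members list is then enough to bring it under the SSH check, with no new service definition required.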
Note that the check_ssh plugin is different from the check_by_ssh plugin, which is used to run checks on remote machines, much like NRPE.
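To make the contrast concrete, a command built on check_by_ssh might look something like the following sketch. The command name is hypothetical, and the example assumes that the nagios user can log in to the remote host with an SSH key and that the plugins are installed under the same path there. Here check_by_ssh logs in and runs check_load on the remote machine, whereas check_ssh only tests that the SSH service answers:

```cfg
# Hypothetical command: run check_load on the remote host over SSH
# (thresholds are illustrative; key-based login for the nagios user is assumed)
define command {
    command_name  check_load_by_ssh
    command_line  $USER1$/check_by_ssh -H $HOSTADDRESS$ -C "/usr/local/nagios/libexec/check_load -w 5,4,3 -c 10,8,6"
}
```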