As established in Chapter 7, Using the Web Interface, we can exert some control over the Nagios Core process from different areas of the web interface: enabling or disabling notifications or checks, scheduling downtime for hosts or services, or acknowledging existing problems. For large networks in particular, however, we often need a more flexible approach, such as running operations on a large number of hosts or services in batches, or enabling or disabling features from another script. Both are cumbersome to manage via the CGIs.
For example, if we had a list of more than a hundred hosts in a segment of our network that were going to become inaccessible due to scheduled maintenance at a known time, it would take a very long time to set up all the scheduled downtime in the web interface to avoid sending out notifications for those hosts. It would be better to do this from the command line.
In this recipe, we'll install a custom script called ncdt, short for Nagios Core Downtime, in /usr/local/bin on our monitoring server. It can be used to quickly create downtime for a named host over a given time period. The downtime is created with the SCHEDULE_HOST_DOWNTIME external command, which is written to the external commands file located at /usr/local/nagios/var/rw/nagios.cmd. We'll then use this script to loop over a list of hostnames to quickly set up an appropriate fixed scheduled downtime for all of them.
You'll need a server running Nagios Core 4.0 or newer. In this example, we'll provide the complete ncdt shell script as part of the instructions, but some experience with shell scripting will help in understanding the script and the loop over the list of hosts. On your monitoring server, you will also need GNU Bash 2.05a or newer, date(1) from GNU sh-utils 2.0 or newer, and the administrative rights to create new scripts in /usr/local/bin.
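The requirements above can be checked before installing anything. The following is a minimal sketch, not part of the recipe itself: the check_prereqs function name and the NAGIOS_CMDFILE override are assumptions made for illustration, and the default path is the external commands file location used throughout this recipe.

```shell
#!/usr/bin/env bash
# Sketch of a pre-flight check for this recipe's requirements.
# check_prereqs and NAGIOS_CMDFILE are hypothetical names, not Nagios features.
cmdfile=${NAGIOS_CMDFILE:-/usr/local/nagios/var/rw/nagios.cmd}

check_prereqs() {
    local status=0
    # GNU date must accept --date with a relative expression
    if ! date +%s --date 'tomorrow' >/dev/null 2>&1 ; then
        printf 'date(1) does not support --date; GNU date is required\n' >&2
        status=1
    fi
    # The external commands file is a named pipe created by Nagios Core
    if [[ ! -p $cmdfile ]] ; then
        printf '%s is not a named pipe; is Nagios Core running?\n' "$cmdfile" >&2
        status=1
    elif [[ ! -w $cmdfile ]] ; then
        printf '%s is not writable by the current user\n' "$cmdfile" >&2
        status=1
    fi
    return "$status"
}
```

Calling check_prereqs prints one message per failed requirement and returns a non-zero status if any of them failed, so it can guard the rest of a script with check_prereqs || exit.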
We can create, test, and apply the ncdt shell script as follows:
Change to /usr/local/bin on the monitoring server, or some other directory in your path suitable for custom scripts like this one:
    # cd /usr/local/bin
Edit a new file called ncdt, and enter the following script:
    # vi ncdt

    #!/usr/bin/env bash

    # Define some fields for the command
    now=$(date +%s)
    author=${SUDO_USER:-$USER}
    fixed=1
    trigger=0
    duration=0
    comment='Batch downtime set by ncdt'

    # Read the hostname from the first argument
    hostname=${1:?}

    # Attempt to parse start and end dates; exit if the call doesn't work
    dtstart=$(date +%s --date "${2:?}") || exit
    dtend=$(date +%s --date "${3:?}") || exit

    # Write the command; the shell reports an error if the write fails,
    # and the script succeeds silently otherwise
    printf '[%lu] SCHEDULE_HOST_DOWNTIME;%s;%s;%s;%u;%u;%u;%s;%s\n' \
        "$now" "$hostname" "$dtstart" "$dtend" \
        "$fixed" "$trigger" "$duration" "$author" "$comment" \
        > /usr/local/nagios/var/rw/nagios.cmd
Make the script executable with chmod(1):
    # chmod +x ncdt
For this example, we'll set downtime for the sparta.example.net host from "now" (the current time) until 5:00am tomorrow:
    # ncdt sparta.example.net now '5:00am tomorrow'
Check the log file at /usr/local/nagios/var/nagios.log to confirm that the command was processed:
    [1448444263] EXTERNAL COMMAND: SCHEDULE_HOST_DOWNTIME;sparta.example.net;1448444263;1448467200;1;0;0;tom;Batch downtime set by ncdt
    [1448444263] HOST DOWNTIME ALERT: sparta.example.net;STARTED; Host has entered a period of scheduled downtime
In this example, the output shows that the external command went through correctly and that the host immediately entered downtime, as expected.
With the script working, we can put one hostname per line in a downtime-list file and use it to quickly set the same period of downtime for a list of any number of hosts, using a while-read loop:
    # while read -r hostname ; do
    >     ncdt "$hostname" now '5:00am tomorrow'
    > done < downtime-list
The Bash script accepts three arguments: the hostname of the host that should enter downtime, the date and time the downtime should start, and the date and time it should end. The dates can be provided in any of the formats and keywords understood by GNU date(1); they are translated to Unix timestamps.
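To get a feel for what date(1) accepts, we can run the same conversion by hand. This is a small sketch: to_epoch is a hypothetical wrapper around the date call used in ncdt, and the timezone is pinned to UTC in the first call only so that the printed value is reproducible.

```shell
#!/usr/bin/env bash
# to_epoch is a hypothetical helper wrapping the date(1) call that ncdt uses
to_epoch() {
    date +%s --date "${1:?}"
}

# An absolute time, with TZ pinned to UTC so the result does not vary
TZ=UTC to_epoch '2015-11-26 05:00:00'    # prints 1448514000

# Relative expressions are resolved against the current time and timezone
to_epoch 'now'
to_epoch '5:00am tomorrow'
to_epoch 'now + 2 hours'

# Unparseable input makes date(1) exit non-zero, which is why ncdt
# follows each conversion with "|| exit"
to_epoch 'not-a-date' 2>/dev/null || echo 'could not parse date'
```

Testing a date expression this way before passing it to ncdt is a cheap way to confirm that it resolves to the time we intended.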
The script fails if any of these are not set. It takes the current timestamp as the command time, determines the name of the user writing the command (preferring SUDO_USER over USER), and uses fixed values for the remaining fields of the SCHEDULE_HOST_DOWNTIME command. It formats the complete command and all its arguments with printf, separating them with semicolons, and writes the completed command to the external commands file to be processed by Nagios Core.
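The formatting step can be tried in isolation by printing the command to the terminal instead of writing it to the external commands file. The sketch below assumes a hypothetical format_downtime function that takes every field as an argument; it uses the same printf format string as ncdt and the values from the log excerpt earlier in this recipe.

```shell
#!/usr/bin/env bash
# format_downtime is a hypothetical helper, not part of the ncdt script.
# Arguments: command-time hostname start end author comment
format_downtime() {
    local now=$1 hostname=$2 dtstart=$3 dtend=$4 author=$5 comment=$6
    # Same fixed values as ncdt: fixed downtime, no trigger, no duration
    local fixed=1 trigger=0 duration=0
    printf '[%lu] SCHEDULE_HOST_DOWNTIME;%s;%s;%s;%u;%u;%u;%s;%s\n' \
        "$now" "$hostname" "$dtstart" "$dtend" \
        "$fixed" "$trigger" "$duration" "$author" "$comment"
}

# Print the command from the log excerpt without touching Nagios Core
format_downtime 1448444263 sparta.example.net 1448444263 1448467200 \
    tom 'Batch downtime set by ncdt'
```

The output is a single line ending in a newline, which is the form Nagios Core expects to read from the external commands file.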
Once we know that the script is working, we can use it in for or while loops to quickly run the same script over a large number of hosts, perhaps taken from a file or from a list pasted into the terminal window.
Custom scripts don't need to be written in shell script. They can be written in any programming language that suits, as the only filesystem interaction needed is to write a line to the external commands file. We could, for example, write scripts similar to ncdt in Perl, Python, or Ruby.
Your scripts could have more features than this or be more user friendly; some ideas for expanding on this simple script might include the following:
A man(1) page, or help output when run with --help
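As a sketch of the help output idea, the fragment below could be placed near the top of a script like ncdt, before the arguments are parsed. The ncdt_usage function name and the wording of the help text are illustrative assumptions.

```shell
#!/usr/bin/env bash
# ncdt_usage is a hypothetical function; the help wording is illustrative
ncdt_usage() {
    cat <<'EOF'
Usage: ncdt HOSTNAME START END
Schedule fixed downtime for HOSTNAME in Nagios Core.

START and END are date strings understood by GNU date(1),
for example: ncdt sparta.example.net now '5:00am tomorrow'
EOF
}

# Print the help text and stop if the first argument asks for it
case ${1-} in
    -h|--help)
        ncdt_usage
        exit 0
        ;;
esac
```

Using ${1-} rather than ${1:?} here lets the check run even when no arguments were given, leaving the existing ${1:?} expansion to report a missing hostname afterwards.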
Here are some other examples of tasks that might benefit from script shortcuts. Note that some of these would require NDOUtils or MK Livestatus as a backend to retrieve data from Nagios Core.
A complete list of the commands available for the external commands file is available at https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/4/en/extcommands.html.
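Because every external command shares the same "[timestamp] NAME;argument;..." layout, one generic helper can cover many of them. The following is a minimal sketch under stated assumptions: nagios_cmd and the NAGIOS_CMDFILE override are hypothetical names, and the argument list for each command still has to be taken from the documentation linked above.

```shell
#!/usr/bin/env bash
# nagios_cmd is a hypothetical helper; NAGIOS_CMDFILE is an assumed override
# for the external commands file path used in this recipe
cmdfile=${NAGIOS_CMDFILE:-/usr/local/nagios/var/rw/nagios.cmd}

# Usage: nagios_cmd COMMAND_NAME [ARGUMENT...]
nagios_cmd() {
    local name=${1:?} now line arg
    shift
    now=$(date +%s)
    line="[$now] $name"
    # Append each remaining argument, separated by semicolons
    for arg in "$@" ; do
        line="$line;$arg"
    done
    printf '%s\n' "$line" > "$cmdfile"
}

# Example calls, commented out so that nothing is sent to Nagios Core
# when this sketch is read into a shell:
# nagios_cmd DISABLE_HOST_NOTIFICATIONS sparta.example.net
# nagios_cmd ENABLE_HOST_NOTIFICATIONS sparta.example.net
```

The helper does no validation of the command name or argument count, so a mistyped command is only noticed when Nagios Core logs it.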