In this recipe, you'll learn how to use the Availability Report to build a table showing statistics for a host, hostgroup, service, or servicegroup. This is useful as a quick metric of overall availability, perhaps to meet the terms of a service-level agreement.
You will need access to the Nagios Core web interface and permission to run commands from the CGIs. The sample configuration installed by following the Quick Start Guide gives the nagiosadmin user all the necessary privileges when authenticated via HTTP.
If you find that you don't have this privilege, check the authorized_for_all_services and authorized_for_all_hosts directives in /usr/local/nagios/etc/cgi.cfg and include your username in both, for example tom:
authorized_for_all_services=nagiosadmin,tom
authorized_for_all_hosts=nagiosadmin,tom
Alternatively, you should also be able to see a host or service's information if you are authenticating with the same username as the nominated contact for the host or service you want to check. This is explained in the Using authenticated contacts recipe in Chapter 10, Security and Performance.
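As a quick way to confirm that a username appears in both directives mentioned above, the following is a minimal sketch that scans text in the key=value style of cgi.cfg. The helper names users_for and has_full_access are hypothetical, and the sample text stands in for the real file; it only checks the two directives named in this recipe and does not model contact-based access.

```python
def users_for(cfg_text, directive):
    """Return the set of usernames listed for one key=value directive."""
    users = set()
    for line in cfg_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == directive:
            users.update(u.strip() for u in value.split(",") if u.strip())
    return users


def has_full_access(cfg_text, username):
    """True if username is listed in both directives this recipe relies on."""
    required = ("authorized_for_all_services", "authorized_for_all_hosts")
    return all(username in users_for(cfg_text, d) for d in required)


# Sample text standing in for /usr/local/nagios/etc/cgi.cfg (assumption).
sample = """\
# excerpt in the style of cgi.cfg
authorized_for_all_services=nagiosadmin,tom
authorized_for_all_hosts=nagiosadmin,tom
"""

print(has_full_access(sample, "tom"))
print(has_full_access(sample, "alice"))
```

To run it against a real installation, read the file contents into a string and pass that in place of sample.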
In this example, we'll view a month's history for a CPU load service on roma.example.net, a Linux server.
We can arrange an Availability Report for the last month for our roma.example.net server like so:
You should be presented with a table showing the percentage of time the host or service has spent in each state. A healthy service might look like this, with a few blips or none at all:
A more problematic service might have large percentages of time in the WARNING or CRITICAL state:
The table includes a quick visual summary of the time spent in each state; clicking it opens a Trends Report with the same criteria. Below the table are the log entries recording the state changes of the service or host.
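A report like this can usually also be reached directly by URL, which is handy for bookmarking. The sketch below builds such a URL under several assumptions: that the report is served by avail.cgi under a /nagios/cgi-bin/ path as in a Quick Start style installation, that it accepts host, service, and timeperiod query parameters, that the monitoring server is reachable at the hypothetical address nagios.example.net, and that the service description is literally "CPU load". Check the address bar of a report generated through the web interface to confirm the parameters your version uses.

```python
from urllib.parse import urlencode


def availability_url(base_url, host, service, period="lastmonth"):
    """Build a URL for a service availability report (parameter names assumed)."""
    params = {"host": host, "service": service, "timeperiod": period}
    return base_url.rstrip("/") + "/cgi-bin/avail.cgi?" + urlencode(params)


# nagios.example.net is a hypothetical monitoring server address.
url = availability_url(
    "http://nagios.example.net/nagios", "roma.example.net", "CPU load"
)
print(url)
```

The request still has to be made as an authenticated user with the privileges described earlier in this recipe.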
Nagios Core assembles state changes from its log files for the specified time period and constructs the table of state changes by percentage. The availability report, therefore, only works for the time covered by your archived log files. The third step of building the report involves a lot of possible options, which are as follows:
workhours to see the percentage of uptime during a time that the server is expected to be busy.
retain_state_information directive in nagios.cfg.
SOFT states, meaning that it will include state changes that do occur but return to their previous state before max_check_attempts is exhausted. Otherwise, it will only graph states that have endured right through the retry checks, or HARD states.
You can choose to run the report for a hostgroup or servicegroup as well, which will yield an indexed table showing both the per-host or per-service percentage state time and also the average uptime for all the hosts or services in the group.
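To make the idea of turning logged state changes into a table of percentages concrete, here is a simplified sketch. It is not Nagios Core's actual implementation: the function name state_percentages, the (timestamp, state) tuple format, and the sample data are all assumptions for illustration, and it ignores scheduled downtime, undetermined time, and the distinction between SOFT and HARD states.

```python
from collections import defaultdict


def state_percentages(changes, start, end, initial_state):
    """Return {state: percent of [start, end) spent in that state}.

    changes is a list of (timestamp, state) tuples, as might be pulled
    from log entries; initial_state is the state assumed at start.
    """
    durations = defaultdict(float)
    current_state, current_time = initial_state, start
    for ts, state in sorted(changes):
        if ts <= start:
            # A change at or before the window only sets the starting state.
            current_state = state
            continue
        if ts >= end:
            break
        # Credit the elapsed time to the state that was in effect.
        durations[current_state] += ts - current_time
        current_state, current_time = state, ts
    # Credit the remainder of the window to the last state seen.
    durations[current_state] += end - current_time
    total = end - start
    return {s: 100.0 * d / total for s, d in durations.items()}


# Made-up example: a 1000-second window with three state changes.
changes = [(200, "WARNING"), (300, "OK"), (900, "CRITICAL")]
report = state_percentages(changes, start=0, end=1000, initial_state="OK")
for state, pct in sorted(report.items()):
    print(f"{state}: {pct:.1f}%")
```

The dependence on initial_state hints at why the assumed starting state matters: with no logged change before the window, the first stretch of time can only be attributed to an assumed state.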