Viewing and interpreting trends

In this recipe, you'll learn how to use the Host and Service State Trends reporting tool to graph the states of a host or service over a fixed period of time. This is useful not only for determining overall availability, perhaps to meet the terms of a service-level agreement, but also for spotting intervals or recurring times at which the host or service enters a state that is not OK. It's a good way to look for patterns in the downtime of your hosts.

Getting started

You will need access to the Nagios Core web interface and permission to view host and service information through the CGIs. The sample configuration installed by following the Quick Start Guide grants the nagiosadmin user all the necessary privileges when authenticated via HTTP.

If you find that you don't have this privilege, check the authorized_for_all_services and authorized_for_all_hosts directives in /usr/local/nagios/etc/cgi.cfg and include your username in both, for example, tom:

authorized_for_all_services=nagiosadmin,tom
authorized_for_all_hosts=nagiosadmin,tom

Alternatively, you should also be able to see a host or service's information if you are authenticating with the same username as the nominated contact for the host or service you want to check.
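
As a quick way to confirm the authorization settings described above, the following is a minimal sketch that checks whether a username appears in both directives. The function names and the sample text are illustrative assumptions and not part of Nagios Core; only the two directive names come from cgi.cfg itself.

```python
# Sketch: check whether a user is named in the cgi.cfg directives that
# control who may view all hosts and all services.
REQUIRED_DIRECTIVES = ("authorized_for_all_services", "authorized_for_all_hosts")


def parse_cgi_cfg(text):
    """Return a dict mapping each directive to its list of comma-separated values."""
    directives = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        directives[key.strip()] = [v.strip() for v in value.split(",") if v.strip()]
    return directives


def missing_directives(text, username):
    """List the required directives that do not name the given user."""
    directives = parse_cgi_cfg(text)
    missing = []
    for name in REQUIRED_DIRECTIVES:
        users = directives.get(name, [])
        # In cgi.cfg, "*" grants the permission to every authenticated user.
        if username not in users and "*" not in users:
            missing.append(name)
    return missing


# Hypothetical configuration in which tom was added to only one directive.
SAMPLE = """\
authorized_for_all_services=nagiosadmin,tom
authorized_for_all_hosts=nagiosadmin
"""

print(missing_directives(SAMPLE, "tom"))  # ['authorized_for_all_hosts']
```

To check a real installation, you could pass in the contents of /usr/local/nagios/etc/cgi.cfg instead of the sample text.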

In this example, we'll view a week's history for the CPU load service on roma.example.net, a web server for which we've been running checks.

In Nagios 4.1.0, a new version of the Trends report was introduced using the new JSON data sources available to the CGIs. We'll demonstrate this one. The older version of the Trends report is still available. Just click on the (Legacy) link next to Trends.

How to do it...

We can arrange a Service State Trends report for the last week for the CPU load service on our roma.example.net server by following these steps:

  1. Log in to the Nagios Core web interface.
  2. Click on the Trends link in the left-hand side menu, beneath Reports.
  3. A pop-up window should appear. Select the type of report, either Host or Service.
  4. Select the specific host and/or service on which we need to report.
  5. Choose a time period for the report. In this example, we'll choose the Last 7 Days report, to see how the service has been performing over the past week. Click on Apply when you're done.

You should be presented with a graph showing the state of the host or service over time, with time markers showing when the state changed, and a percentage breakdown of the relative states to the right.

A healthy service might show only a few blips or none at all, whereas a more problematic service might have long periods of time in the WARNING or CRITICAL state.

You can double-click on sections of the graph to zoom in on them, provided that you did not select the Suppress image map option on the Advanced tab of the report dialog.

How it works...

Nagios Core assembles state changes from its log files for the specified time period and graphs them by color, marking dates at regular intervals along the horizontal axis. The trends graph therefore only works for times covered by your archived log files. The report dialog that appears in the third step offers many more options if you switch to the Advanced tab, which are as follows:

  • Report period: This drop-down menu allows us to choose a fixed period, for convenience, relative to the current date; alternatively, a custom time period may be used by selecting the final CUSTOM REPORT PERIOD option and selecting dates in the two fields that follow:
    • Start Date (Inclusive): This is the date on which the report should start, if the custom time period option has been set.
    • End Date (Inclusive): This is the date on which the report should end, if the custom time period option has been set.
  • Assume Initial States: If Nagios Core can't determine the state of the host or service between the start of the report and the first check, it will assume the state given in the First Assumed State field.
  • Assume State Retention: If Nagios Core was restarted one or more times during the reporting period, checking this option will make the report assume that the state before any restart was retained by Nagios Core until it started again; state retention itself is enabled with the retain_state_information directive in nagios.cfg.
  • Assume States During Program Downtime: If Nagios Core finds from its log files that it was down for a period, it will assume that the host or service stayed in the last state it recorded before going down for the whole of that period.
  • Include Soft States: Nagios Core will graph SOFT states, meaning that it will include state changes that occur but return to their previous state before max_check_attempts is exhausted. Otherwise, it will only graph HARD states, which are those that have endured right through the retry checks.
  • First Assumed State: This is the value Nagios Core should assume for the host or service state if it can't determine the value from the log files.
  • Backtracked Archives: This sets the number of archived log files that Nagios Core should check to try and find the initial state of the host or service.
  • Suppress image map: This prevents the graph from being clickable to zoom in on a particular region of it, perhaps for browser compatibility.
  • Suppress popups: This prevents the graph from showing popups when hovering over its sections, perhaps for browser compatibility.
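
To make the log-based approach and two of these options more concrete, here is a rough sketch of how a state breakdown could be computed from SERVICE ALERT lines in a Nagios log. This is not the code the Trends CGI actually runs; the function names, the include_soft and first_assumed_state parameters, and the shortened timestamps in the sample are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Matches lines such as:
# [1700000000] SERVICE ALERT: host;service;STATE;HARD;attempt;plugin output
ALERT_RE = re.compile(
    r"^\[(\d+)\] SERVICE ALERT: ([^;]+);([^;]+);([^;]+);(SOFT|HARD);"
)


def parse_alerts(lines, host, service, include_soft=False):
    """Yield (timestamp, state) pairs for one service from Nagios log lines."""
    for line in lines:
        match = ALERT_RE.match(line)
        if not match:
            continue
        timestamp, line_host, line_service, state, state_type = match.groups()
        if (line_host, line_service) != (host, service):
            continue
        # Mirrors the idea of "Include Soft States": drop SOFT changes by default.
        if state_type == "SOFT" and not include_soft:
            continue
        yield int(timestamp), state


def state_breakdown(events, start, end, first_assumed_state="UNDETERMINED"):
    """Return the percentage of the window [start, end) spent in each state."""
    seconds = defaultdict(int)
    current_state = first_assumed_state  # used until the first logged change
    cursor = start
    for timestamp, state in sorted(events):
        if timestamp <= start:
            # A change logged before the window tells us the real initial state.
            current_state = state
            continue
        if timestamp >= end:
            break
        seconds[current_state] += timestamp - cursor
        current_state, cursor = state, timestamp
    seconds[current_state] += end - cursor
    total = end - start
    return {state: round(100.0 * spent / total, 2) for state, spent in seconds.items()}


# Hypothetical log excerpt; timestamps are shortened to keep the arithmetic readable.
LOG = [
    "[100] SERVICE ALERT: roma.example.net;CPU load;WARNING;SOFT;1;load high",
    "[200] SERVICE ALERT: roma.example.net;CPU load;WARNING;HARD;3;load high",
    "[500] SERVICE ALERT: roma.example.net;CPU load;OK;HARD;1;load normal",
    "[600] SERVICE ALERT: other.example.net;CPU load;CRITICAL;HARD;3;load very high",
]

hard_only = parse_alerts(LOG, "roma.example.net", "CPU load")
print(state_breakdown(hard_only, 0, 1000, first_assumed_state="OK"))
# {'OK': 70.0, 'WARNING': 30.0}

with_soft = parse_alerts(LOG, "roma.example.net", "CPU load", include_soft=True)
print(state_breakdown(with_soft, 0, 1000, first_assumed_state="OK"))
# {'OK': 60.0, 'WARNING': 40.0}
```

Including the SOFT change moves the start of the WARNING period from 200 to 100, which is why the second breakdown shows more time in WARNING.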

There's more...

Note that it's important to ensure that checks have actually been running for the entire time for which you're running the report, as otherwise the State Breakdowns section will have distorted statistics. There is not much point in running a yearly report for a host that has only existed for six months! In general, the more frequent and consistent your checks, the more accurate the trends graph will be.
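
The size of this distortion is easy to underestimate, so here is a small worked example. The figures are invented for illustration: a 360-day report on a host that has only been monitored for 180 days, 9 of which were spent in a CRITICAL state.

```python
def percent(part, whole):
    """Return part as a percentage of whole, rounded to two decimal places."""
    return round(100.0 * part / whole, 2)


REPORT_DAYS = 360      # length of the report window (hypothetical)
MONITORED_DAYS = 180   # days for which checks actually ran (hypothetical)
CRITICAL_DAYS = 9      # days spent in CRITICAL while monitored (hypothetical)

ok_days = MONITORED_DAYS - CRITICAL_DAYS
unmonitored_days = REPORT_DAYS - MONITORED_DAYS

# Availability over the period that was really monitored.
actual = percent(ok_days, MONITORED_DAYS)

# The unmonitored half counted as OK, for example via a first assumed state.
assumed_ok = percent(ok_days + unmonitored_days, REPORT_DAYS)

# The unmonitored half left as undetermined time.
undetermined = percent(ok_days, REPORT_DAYS)

print(actual, assumed_ok, undetermined)  # 95.0 97.5 47.5
```

Neither of the last two figures reflects the 95 percent availability measured while checks were running, which is why the report period should match the period actually covered by checks.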

See also

  • The Viewing and interpreting availability reports recipe in this chapter
  • The Viewing and interpreting notification history recipe in this chapter