Chapter 3. Working with Checks and States

In this chapter, we will cover the following recipes:

Specifying how frequently to check a host or service
Changing thresholds for PING RTT and packet loss
Changing thresholds for disk usage
Scheduling downtime for a host or service
Managing brief outages with flapping
Adjusting flapping percentage thresholds for a service

Introduction

Once hosts and services are configured in Nagios Core, its behavior is primarily dictated by the checks it makes to ensure that they are operating as expected, and by the states it assigns to those hosts and services as a result of the checks.

How often it's appropriate to check hosts and services, and on what basis it's appropriate to flag a host or service as problematic, depends very much on the nature of the service and how important it is that it runs all the time. If a host on the other side of the world is being checked with PING and, during busy periods, its round trip time is over 100 ms, this may not be a cause for concern at all, and perhaps is not something to flag even a WARNING state over, let alone a CRITICAL one.

However, if the same host were on the local network, where it would be reasonable to expect a round trip time of less than 10 ms, a round trip time of more than 100 ms could well be considered a grave cause for concern, perhaps signaling a packet storm or another problem with the local network. In such cases, we would want to notify the appropriate administrators immediately. Similarly, for services such as web servers, we may not be concerned by a response time of more than a second for a page on a busy, budget shared web host for customers. However, if the response time for a corporate website or a dedicated colocation customer were that slow, it might be important to notify the web server administrator about it.

Not all hosts and services are, therefore, equal. Nagios Core provides several ways to define behaviors with more precision, which are as follows:

How often a host or service should be checked with its appropriate check_command
How bad a check's results have to be before a WARNING or CRITICAL state is flagged, if at all
When a host or service has scheduled downtime, so that Nagios Core knows not to expect it to operate during a specified period, often for upgrades or other maintenance
Whether to automatically tolerate flapping, that is, hosts and services that change state frequently

This chapter uses some common problems involving the preceding behaviors as examples to show how to configure them.
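
As a rough preview of how the behaviors listed above map onto configuration, the following is a minimal sketch of a service definition for a host on the local network. The host name and every numeric value are hypothetical, and it assumes that the generic-service template and a check_ping command taking warning and critical thresholds as its two arguments are defined as in the sample configuration shipped with Nagios Core. Scheduled downtime does not appear here because it is not part of an object definition; it is normally scheduled at runtime, for example through the web interface.

```
# Hypothetical host and values, for illustration only.
define service {
    use                     generic-service      ; assumed template from the sample configuration
    host_name               sparta.example.net   ; hypothetical host
    service_description     PING
    ; WARNING at 10 ms RTT or 5% packet loss; CRITICAL at 100 ms RTT or 20% packet loss
    check_command           check_ping!10.0,5%!100.0,20%
    check_interval          5       ; time units between regular checks (minutes by default)
    retry_interval          1       ; time units between rechecks after a non-OK result
    max_check_attempts      3       ; attempts before the problem state is treated as HARD
    flap_detection_enabled  1       ; enable flap detection for this service
    low_flap_threshold      10.0    ; percent state change below which flapping is considered stopped
    high_flap_threshold     30.0    ; percent state change above which flapping is considered started
}
```

The thresholds here are deliberately strict because the example assumes a nearby host; for a host on the other side of the world, much looser RTT values would likely be more appropriate.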
