Chapter 5. Host and Service Availability Reporting

Alerting is an important aspect of monitoring. There is no point in having a monitoring system if we don't have efficient and prompt alerting mechanisms in place. As discussed in previous chapters, Icinga provides a flexible way of configuring the type of alerts to send. Icinga itself isn't aware of the alerting mechanism. It simply knows when to call a notification (shell) command according to the Icinga notification configuration. Icinga supplies the necessary details about the problem to the command, and the command takes over from there, using whatever mode of alerting it is programmed to use to actually send the alert. This flexibility allows us to easily add custom alerting mechanisms and simply configure Icinga to use them.

Each host/service definition has Icinga contact objects associated with it (using the contacts directive, a comma-separated list). Each contact object has directives that name the Icinga command objects it should use for notification. Each command object, in turn, has a directive that specifies the exact command to execute (as it would look in a terminal). So, when a host/service check goes critical, its contacts are looked up and the associated command objects are called, which in turn execute the command that actually handles the sending of the notification.

This notification command can be a script that sends an e-mail or an SMS, posts a message to a Jabber contact (using a Jabber bot), or makes an API call (using cURL, for instance) to an external system. The script is usually placed at a convenient location on the Icinga server, and the notification command invokes it to do the processing.
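
As a rough illustration of such a script, the following sketch formats the details passed by Icinga and posts them to an external system with cURL. The script path, the endpoint URL, the form field name, and the function names are all assumptions made for this example; they are not part of the default Icinga installation.

```shell
#!/bin/sh
# Hypothetical notification helper; the path, URL, and field name are assumptions.
#
# A matching command object might call it like this (illustrative only):
#   command_line  /usr/local/bin/notify-by-webhook.sh "$NOTIFICATIONTYPE$" "$HOSTALIAS$" "$SERVICEDESC$" "$SERVICESTATE$" "$SERVICEOUTPUT$"

# format_alert TYPE HOST SERVICE STATE OUTPUT
# Prints a one-line summary followed by the plugin output.
format_alert() {
    printf '%s: %s/%s is %s\n%s\n' "$1" "$2" "$3" "$4" "$5"
}

# send_alert URL MESSAGE
# Posts the message as a form field named "message" (an assumed field name).
# --fail makes curl return a non-zero exit status on HTTP errors.
send_alert() {
    curl --fail --silent --show-error --data-urlencode "message=$2" "$1"
}

# Typical use inside the script, left commented so the sketch has no side effects:
#   send_alert "https://alerts.example.com/hook" "$(format_alert "$@")"
```

Keeping the formatting and the delivery in separate functions makes it easy to swap the delivery step (e-mail, SMS gateway, Jabber bot) without touching the message layout.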

In this chapter, we will learn how to configure Icinga to notify contacts about problems, and how to customize the various associated parameters.

Default configuration

The default Icinga installation has basic monitoring already set up (the one that we saw for localhost monitoring) along with e-mail alerting. Let's take another quick look at one of the service checks, HTTP.

The service check definition, as in localhost.cfg, is as follows:

define service {
    use                     local-service
    host_name               localhost
    service_description     HTTP
    check_command           check_http
#   notifications_enabled   0       ; make sure this is commented
}

The HTTP check inherits the local-service template service, which in turn inherits the generic-service template service that has the contact information (templates.cfg):

define service {
    name                            generic-service
    active_checks_enabled           1
    passive_checks_enabled          1
    parallelize_check               1
    obsess_over_service             1
    check_freshness                 0
    notifications_enabled           1
    event_handler_enabled           1
    flap_detection_enabled          1
    failure_prediction_enabled      1
    process_perf_data               1
    retain_status_information       1
    retain_nonstatus_information    1
    is_volatile                     0
    check_period                    24x7
    max_check_attempts              3
    normal_check_interval           10
    retry_check_interval            2
    contact_groups                  admins
    notification_options            w,u,c,r
    notification_interval           60
    notification_period             24x7
    register                        0
}

The preceding template has contact_groups defined as admins, which is an Icinga contactgroup definition (essentially a group of Icinga contact objects) in contacts.cfg:

define contactgroup {
    contactgroup_name       admins
    alias                   Icinga Administrators
    members                 icingaadmin
}

The admins contact group has icingaadmin as a contact member:

define contact {
    use             generic-contact
    contact_name    icingaadmin
    alias           Icinga Admin
    email           icinga@localhost
}

The email directive in the preceding definition should reflect the e-mail address that we had set in Chapter 1, Installation and Configuration, in order to receive a test alert.

So far, we have the HTTP service check associated with the admins contact group, which has icingaadmin as a member contact, which in turn has an e-mail address. What we haven't seen yet is the notification command used to send alerts. Let's dig deeper.

The icingaadmin contact inherits a generic-contact template contact, which has the notification commands (templates.cfg):

define contact {
    name                                generic-contact
    host_notification_period            24x7
    service_notification_options        w,u,c,r,f,s
    host_notification_options           d,u,r,f,s
    service_notification_commands       notify-service-by-email
    host_notification_commands          notify-host-by-email
    register                            0
}

The contact definition template has both a service_notification_commands directive and a host_notification_commands directive, which specify the notification commands to be used for service alerts and host alerts respectively. Since we are considering the HTTP service check for now, we will look at the former. The notify-service-by-email object is an Icinga command object (commands.cfg):

define command {
    command_name    notify-service-by-email
    command_line    /usr/bin/printf "%b" "***** Icinga *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$\n" | /bin/mail -s "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
}

This command object definition specifies, in its long command_line directive, the command to be executed. The command line, in this case, builds the entire body of the e-mail and pipes it to /bin/mail, which is passed a subject and sends the e-mail to the address given by $CONTACTEMAIL$. The following figure illustrates the configuration objects and how they are related:

Relationship among service, contact group, contact, and command objects that form the notification configuration

As you may have noticed, there are a lot of parameters used in the command line. These are the Icinga macros that can be used inside object definitions. The Icinga documentation has an exhaustive list of available macros (and the types of objects each is available in). Let's look at the ones used here, with the values they will take for the HTTP service check example:

  • $NOTIFICATIONTYPE$: This macro takes the type of notification to be sent, for example, PROBLEM or RECOVERY. When Icinga detects that the service is CRITICAL, it sends a PROBLEM notification, and when the service returns to OK, it sends a RECOVERY notification.
  • $SERVICEDESC$: This macro takes the value of the service_description directive in the service object for the check this notification is for. In our example, the value will be HTTP.
  • $HOSTALIAS$: This macro takes the value of the alias directive in the host object to which this service check belongs. If this directive is not specified, it takes the value of the host_name directive. In our example, the value will be localhost.
  • $HOSTADDRESS$: This macro takes the value of the address directive in the same host object. In our example, the value will be 127.0.0.1.
  • $SERVICESTATE$: This macro takes the current state of the service check, that is, one of CRITICAL, WARNING, UNKNOWN, or OK. When the HTTP service is down, this value will be CRITICAL, and when it recovers, the value will be OK.
  • $LONGDATETIME$: This macro gives the current date/time stamp, that is, the time at which the notification is generated. The format is defined with the date_format directive in the main icinga.cfg configuration file.
  • $SERVICEOUTPUT$: This macro gives the output of the service check as reported by the check plugin. In our example, the value will be something like HTTP CRITICAL - Unable to open TCP socket when the check goes CRITICAL.
  • $CONTACTEMAIL$: This macro takes the value of the email directive in the contact definition that references this command object. In our example, the value will be the e-mail address that we had defined in Chapter 1, Installation and Configuration ([email protected]).

Each of the preceding macros is replaced with its value in the command line and the resulting command string is executed. The resulting command string for our example will look similar to the following:

/usr/bin/printf "%b" "***** Icinga *****

Notification Type: PROBLEM

Service: HTTP
Host: localhost
Address: 127.0.0.1
State: CRITICAL

Date/Time: 07-03-2013 13:47:52

Additional Info:

HTTP CRITICAL - Unable to open TCP socket
" | /bin/mail -s "** PROBLEM Service Alert: localhost/HTTP is CRITICAL **" [email protected]

The e-mail message, as we would receive it, would look similar to the following:

From: icinga@localhost
To: [email protected]
Subject: ** PROBLEM Service Alert: localhost/HTTP is CRITICAL **

***** Icinga *****

Notification Type: PROBLEM

Service: HTTP
Host: localhost
Address: 127.0.0.1
State: CRITICAL

Date/Time: 07-03-2013 13:47:52

Additional Info:

HTTP CRITICAL - Unable to open TCP socket

We will also receive a similar e-mail when the service recovers.
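
The macro substitution that produces the command string and e-mail above can be imitated in a few lines of shell. This is only an illustrative sketch: Icinga performs the expansion internally, and the expand_macros function and its NAME=VALUE argument convention are assumptions made for this example.

```shell
#!/bin/sh
# Illustrative sketch only; Icinga expands macros itself before running the command.
#
# expand_macros TEMPLATE NAME=VALUE [NAME=VALUE ...]
# Replaces every $NAME$ token in TEMPLATE with VALUE and prints the result.
# The function body runs in a subshell, so its variables stay private.
expand_macros() (
    text=$1
    shift
    for pair in "$@"; do
        token="\$${pair%%=*}\$"   # for example, $SERVICESTATE$
        value=${pair#*=}
        out=
        # Replace each occurrence of the token, working left to right.
        while :; do
            case $text in
                *"$token"*)
                    out=$out${text%%"$token"*}$value
                    text=${text#*"$token"}
                    ;;
                *)
                    break
                    ;;
            esac
        done
        text=$out$text
    done
    printf '%s\n' "$text"
)
```

Calling expand_macros with the subject template from the command definition (in single quotes, so that the shell leaves the $...$ tokens alone) and the arguments NOTIFICATIONTYPE=RECOVERY, HOSTALIAS=localhost, SERVICEDESC=HTTP, and SERVICESTATE=OK prints ** RECOVERY Service Alert: localhost/HTTP is OK **, which is the subject line we would expect on the recovery e-mail.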
