Alerting is an essential part of monitoring. There is little point in running a monitoring system if it cannot alert us promptly and reliably. As discussed in previous chapters, Icinga provides a flexible way of configuring the type of alerts to send. Icinga itself isn't aware of the alerting mechanism; it simply knows when to call a notification (shell) command according to its notification configuration. Icinga supplies the necessary details about the problem to the command, and the command takes over from there, using whatever mode of alerting it is programmed for to actually send the alert. This flexibility allows us to add custom alerting mechanisms easily and simply configure Icinga to use them.
Each host/service definition has Icinga contact objects associated with it (using the contacts directive, a comma-separated list). Each contact object has directives that name the Icinga command objects it should use for notification. Each command object, in turn, has a directive that specifies the exact command to execute (as it would look in a terminal). So, when a host/service check goes critical, its contacts are looked up and the associated command objects are called, which in turn execute the command that actually sends the notification.
This notification command can be a script that sends an e-mail or an SMS, posts a message to a Jabber contact (using a Jabber bot), or makes an API call (using cURL, for instance) to an external system. The script is usually placed at a convenient location and invoked by the notification command.
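As a rough illustration of such a script, here is a minimal sketch. The function names, the argument order, and the idea of passing five macro values on the command line are assumptions made for this example, not part of the default Icinga installation; the mail command is assumed to be available on the monitoring host.

```shell
#!/bin/sh
# Hypothetical notification script (name, location and argument order are assumptions).
# Expected arguments: TYPE HOST SERVICE STATE RECIPIENT

format_alert() {
    # $1 = notification type, $2 = host, $3 = service, $4 = state
    printf '%s: %s/%s is %s\n' "$1" "$2" "$3" "$4"
}

send_alert() {
    # $1 = recipient, $2 = message text.
    # E-mail is used here; an SMS gateway or a cURL call could take its place.
    printf '%s\n' "$2" | mail -s "Icinga alert" "$1"
}

# Send only when all five arguments are supplied.
if [ "$#" -eq 5 ]; then
    send_alert "$5" "$(format_alert "$1" "$2" "$3" "$4")"
fi
```

A command object could then call the script with the relevant macros as arguments, for example "$NOTIFICATIONTYPE$" "$HOSTALIAS$" "$SERVICEDESC$" "$SERVICESTATE$" "$CONTACTEMAIL$", keeping the alerting logic outside the Icinga configuration.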
In this chapter, we will learn how to configure Icinga to notify contacts about problems, and how to customize the various associated parameters.
The default Icinga installation has basic monitoring already set up (the one that we saw for localhost monitoring) along with e-mail alerting. Let's take another quick look at one of the service checks, HTTP.
The service check definition, as in localhost.cfg, is as follows:
define service {
use local-service
host_name localhost
service_description HTTP
check_command check_http
# notifications_enabled 0 ; make sure this is commented
}
The HTTP check inherits the local-service template service, which in turn inherits the generic-service template service that has the contact information (templates.cfg):
define service {
name generic-service
active_checks_enabled 1
passive_checks_enabled 1
parallelize_check 1
obsess_over_service 1
check_freshness 0
notifications_enabled 1
event_handler_enabled 1
flap_detection_enabled 1
failure_prediction_enabled 1
process_perf_data 1
retain_status_information 1
retain_nonstatus_information 1
is_volatile 0
check_period 24x7
max_check_attempts 3
normal_check_interval 10
retry_check_interval 2
contact_groups admins
notification_options w,u,c,r
notification_interval 60
notification_period 24x7
register 0
}
The preceding template has contact_groups defined as admins, which is an Icinga contactgroup definition (essentially a group of Icinga contact objects) in contacts.cfg:
define contactgroup {
contactgroup_name admins
alias Icinga Administrators
members icingaadmin
}
The admins contact group has icingaadmin as a contact member:
define contact {
use generic-contact
contact_name icingaadmin
alias Icinga Admin
email icinga@localhost
}
The email directive in the preceding definition should reflect the e-mail address that we had set in Chapter 1, Installation and Configuration, in order to receive a test alert.
So far, we have the HTTP service check associated with the admins contact group, which has icingaadmin as a member contact, which in turn has an e-mail address. So far so good, but we still haven't seen the notification command used to send alerts. Let's dig deeper.
The icingaadmin contact inherits the generic-contact template contact, which has the notification commands (templates.cfg):
define contact {
name generic-contact
host_notification_period 24x7
service_notification_options w,u,c,r,f,s
host_notification_options d,u,r,f,s
service_notification_commands notify-service-by-email
host_notification_commands notify-host-by-email
register 0
}
The contact definition template has both a service_notification_commands directive and a host_notification_commands directive, which specify the notification commands to be used for service alerts and host alerts respectively. Since we are considering the HTTP service check for now, we will look at the former. The notify-service-by-email object is an Icinga command object (commands.cfg):
define command {
command_name notify-service-by-email
command_line /usr/bin/printf "%b" "***** Icinga *****\nNotification Type: $NOTIFICATIONTYPE$\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\nDate/Time: $LONGDATETIME$\nAdditional Info:\n$SERVICEOUTPUT$\n" | /bin/mail -s "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
}
This command object definition specifies, in its long command_line directive, the command to be executed. The command line, in this case, builds the entire body of the e-mail and pipes it to /bin/mail, which is passed a subject and sends the e-mail to the address given by $CONTACTEMAIL$. The following figure illustrates the configuration objects and how they are related:
As you may have noticed, there are a lot of parameters used in the command line. These are the Icinga macros that can be used inside object definitions. The Icinga documentation has an exhaustive list of available macros (and the types of objects each is available in). Let's look at the ones used here, with the values they will take for the HTTP service check example:
$NOTIFICATIONTYPE$: This macro takes the type of notification to be sent, for example, PROBLEM or RECOVERY. When Icinga detects the service to be CRITICAL, it sends a problem notification, and when it becomes OK, it sends a recovery notification.
$SERVICEDESC$: This macro takes the value of the service_description directive in the service object for the check this notification is for. Our example will take the HTTP value.
$HOSTALIAS$: This macro takes the value of the alias directive in the host object to which this service check belongs. If this directive is not specified, it takes the value of the host_name directive. Our example will have localhost as its value.
$HOSTADDRESS$: This macro takes the value of the address directive in the same host object as above. Our example will take 127.0.0.1 as the value.
$SERVICESTATE$: This macro takes the current state of the service check, that is, one of CRITICAL, WARNING, UNKNOWN, or OK. When the HTTP service is down, this value will be CRITICAL, and when it recovers, the value will be OK.
$LONGDATETIME$: This macro gives the current date/time stamp at the moment the notification is generated. The format is defined with the date_format directive in the main icinga.cfg configuration file.
$SERVICEOUTPUT$: This macro gives the output of the service check as reported by the check plugin. Our example will have something like HTTP CRITICAL - Unable to open TCP socket as the value when the check goes CRITICAL.
$CONTACTEMAIL$: This macro takes the value of the email directive in the contact definition that references this command object. Our example will take the e-mail address that we had defined in Chapter 1, Installation and Configuration, as the value of this macro.
Each of the preceding macros is replaced with its value in the command line and the resulting command string is executed. The resulting command string for our example will look similar to the following:
/usr/bin/printf "%b" "***** Icinga *****
Notification Type: PROBLEM
Service: HTTP
Host: localhost
Address: 127.0.0.1
State: CRITICAL
Date/Time: 07-03-2013 13:47:52
Additional Info:
HTTP CRITICAL - Unable to open TCP socket
" | /bin/mail -s "** PROBLEM Service Alert: localhost/HTTP is CRITICAL **" [email protected]
The e-mail message, as we would receive it, would look similar to the following:
From: icinga@localhost
To: [email protected]
Subject: ** PROBLEM Service Alert: localhost/HTTP is CRITICAL **

***** Icinga *****
Notification Type: PROBLEM
Service: HTTP
Host: localhost
Address: 127.0.0.1
State: CRITICAL
Date/Time: 07-03-2013 13:47:52
Additional Info:
HTTP CRITICAL - Unable to open TCP socket
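To preview how such a body is rendered without sending any mail, the printf half of the command can be run by hand. The sketch below assumes the message is written on one line with \n escapes, which the %b format of printf expands into line breaks; the function name render_body and the shortened field list are illustrative only.

```shell
#!/bin/sh
# Dry run of the body-building half of the notification command.
# Arguments: TYPE SERVICE HOST STATE. Nothing is mailed.
render_body() {
    # %b makes printf expand backslash escapes such as \n in its argument.
    printf "%b" "***** Icinga *****\nNotification Type: $1\nService: $2\nHost: $3\nState: $4\n"
}

render_body PROBLEM HTTP localhost CRITICAL

# To send a real test message, the output could be piped to the mailer, e.g.:
#   render_body PROBLEM HTTP localhost CRITICAL | /bin/mail -s "test alert" icinga@localhost
```

Running the pipeline manually like this is a quick way to separate mail delivery problems from Icinga configuration problems.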
We will also receive a similar e-mail when the service recovers.
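Icinga performs macro substitution internally, so the following is only a rough shell imitation of that step, meant to show how the same subject template yields both the problem and the recovery subject. The expand_macros helper is hypothetical, and values containing |, & or backslashes would need escaping before being handed to sed.

```shell
#!/bin/sh
# Rough imitation of macro substitution (Icinga itself does this internally).
# Usage: expand_macros TEMPLATE NAME=VALUE...
# Each $NAME$ in TEMPLATE is replaced with VALUE.
expand_macros() (
    text=$1
    shift
    for pair in "$@"; do
        name=${pair%%=*}    # part before the first '='
        value=${pair#*=}    # part after the first '='
        # [$] matches a literal dollar sign in the sed pattern
        text=$(printf '%s\n' "$text" | sed "s|[$]${name}[$]|${value}|g")
    done
    printf '%s\n' "$text"
)

subject='** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **'

# prints: ** PROBLEM Service Alert: localhost/HTTP is CRITICAL **
expand_macros "$subject" NOTIFICATIONTYPE=PROBLEM HOSTALIAS=localhost \
    SERVICEDESC=HTTP SERVICESTATE=CRITICAL

# prints: ** RECOVERY Service Alert: localhost/HTTP is OK **
expand_macros "$subject" NOTIFICATIONTYPE=RECOVERY HOSTALIAS=localhost \
    SERVICEDESC=HTTP SERVICESTATE=OK
```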