Sending recovery messages

The setup we have used so far only sent out messages when a problem occurred. That was ensured by the Trigger value = PROBLEM condition, which was added by default. One way to also send messages when a trigger is resolved would be to remove that condition, but this does not work well when escalation functionality is used. Thus, we suggest leaving that condition in place and enabling recovery messages on the action level instead.

Let's enable recovery messages for our SNMP trap action:

  1. Go to Configuration | Actions, click on SNMP action in the Name column, and select the Recovery operations tab. Now, we can customize the recovery message. Instead of sending similar messages for problems and recoveries, we can make recoveries stand out a bit more. Hey, that's a good idea. We will be sending out emails to management, so let's add a feel-good touch here.
  2. In the Operations box on the Recovery operations tab, tell Zabbix to send messages to our user group, Our users, through all media.
Do not remove the trigger value condition when enabling recovery messages. Doing so can result in recovery messages being escalated, and thus generate a huge number of useless messages.
  3. Click on the Update button.

This gives the outgoing recovery messages a sort of double affirmation that everything is good: the subject will start with Resolved:, followed by the name of the event. To test the new configuration, set the trap to generate a problem and wait for the problem to resolve. This time, two emails should be sent, and the second one should come with our custom subject.

In the email that arrives, note the line at the very end that looks similar to this:

Original event ID: 1313 

The number at the end of the line is the event ID, a unique identifier of the occurrence of the problem. More precisely, it is the so-called original event ID: the ID of the original problem, which is the same in the problem and recovery notifications. This makes it very useful for automatically matching recovery messages with problem messages when sending this data to an issue management or ticketing system. Recovery information can then be used to automatically close tickets or to add information to them.
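As a rough illustration of this matching, the following Python sketch pulls the original event ID out of a message body and uses it to open and close tickets. The TicketTracker class, the sample subjects, and the message bodies are hypothetical stand-ins, not part of Zabbix or of any real ticketing system; only the Original event ID line and the Resolved: subject prefix come from the setup above.

```python
import re

# Matches the trailing line of the notification body, e.g. "Original event ID: 1313".
EVENT_ID_RE = re.compile(r"^Original event ID:\s*(\d+)\s*$", re.MULTILINE)


def extract_event_id(body):
    """Return the original event ID found in a message body, or None."""
    match = EVENT_ID_RE.search(body)
    return int(match.group(1)) if match else None


class TicketTracker:
    """Hypothetical in-memory stand-in for a ticketing system."""

    def __init__(self):
        self.tickets = {}  # original event ID -> "open" or "closed"

    def handle(self, subject, body):
        event_id = extract_event_id(body)
        if event_id is None:
            return None  # not a notification we can correlate
        if subject.startswith("Resolved:"):
            # Recovery: close the ticket opened for the same event ID, if any.
            if event_id in self.tickets:
                self.tickets[event_id] = "closed"
        else:
            # Problem: open a ticket unless one already exists for this event.
            self.tickets.setdefault(event_id, "open")
        return event_id


# Sample messages (made up for this sketch).
tracker = TicketTracker()
tracker.handle("Problem: SNMP trap received", "Details...\nOriginal event ID: 1313 ")
tracker.handle("Resolved: SNMP trap received", "Details...\nOriginal event ID: 1313 ")
print(tracker.tickets)  # {1313: 'closed'}
```

Because both notifications carry the same ID, the recovery message can find the ticket without any fuzzy matching on subjects or host names.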

This ID was produced by the {EVENT.ID} macro and, as with many other macros, you can use it in your actions. If you want to uniquely identify the recovery event, there's another macro for that: {EVENT.RECOVERY.ID}.
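To get a feel for how the two macros differ, here is a small Python sketch that fills a message template with sample values. The template wording, the expand function, and the ID values are made up for illustration; Zabbix expands macros itself when it sends the message, and this is not how it does so internally.

```python
import re

# Hypothetical recovery message text; only the macro names come from Zabbix.
TEMPLATE = (
    "Original event ID: {EVENT.ID}\n"
    "Recovery event ID: {EVENT.RECOVERY.ID}"
)

# A macro is an uppercase, dot-separated name wrapped in curly braces.
MACRO_RE = re.compile(r"\{([A-Z0-9_.]+)\}")


def expand(template, values):
    """Replace each {MACRO} with its value; this sketch leaves unknown macros as-is."""
    return MACRO_RE.sub(lambda m: str(values.get(m.group(1), m.group(0))), template)


# Sample values: the problem event and the later recovery event have different IDs.
print(expand(TEMPLATE, {"EVENT.ID": 1313, "EVENT.RECOVERY.ID": 1320}))
```

With these sample values, the first line carries the ID shared by the problem and recovery notifications, while the second line identifies the recovery event alone.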

There are a lot of macros, so make sure to consult the Zabbix manual for a full list of them.

You may have noticed that our recovery operation had no option to send the recovery message right away or with a delay, as we could with our problem message. This option is not yet available. Now imagine a scenario where we send out our problem message after 10 minutes, but the problem is already resolved after 5 minutes. In this case, we will get an OK email after 5 minutes but no problem message, as that one was delayed by 10 minutes. This can be confusing.

Zabbix has also introduced another tab, Update operations. It works in the same way as the Recovery operations tab, but sends an update when someone clicks the Ack button in Zabbix for this issue. Everybody involved will then receive updates.
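The delayed-problem scenario can be made concrete with a toy timeline in Python. This is only a model of the behavior described in this section, under the assumption that a delayed problem message is skipped once the problem has resolved and that the recovery message always goes out immediately; the notifications function is hypothetical and does not reflect how Zabbix is implemented.

```python
def notifications(problem_delay_min, resolved_after_min):
    """Toy model: return the (minute, kind) messages sent for a single problem.

    Assumes the problem message is sent only if the problem is still
    unresolved when its delay expires, and that the recovery message is
    sent as soon as the problem resolves.
    """
    sent = []
    if problem_delay_min < resolved_after_min:
        sent.append((problem_delay_min, "problem"))
    sent.append((resolved_after_min, "recovery"))
    return sent


# Problem message delayed by 10 minutes, problem resolved after 5 minutes.
print(notifications(10, 5))  # [(5, 'recovery')]

# No delay on the problem message: both messages arrive.
print(notifications(0, 5))   # [(0, 'problem'), (5, 'recovery')]
```

The first call reproduces the confusing case: a recipient sees a recovery email without ever having seen the matching problem email.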
