Method 1 – last access item

Recall the last access column in Administration | Proxies. Of course, watching it all the time is not feasible, so the same value can also be collected as an internal item. To create such an item, do the following:

  1. Let's go to Configuration | Hosts, click on Items next to the host that runs your proxy, and click on Create item. Fill in the following values:
    • Name: $2: last access
    • Type: Zabbix internal
    • Key: zabbix[proxy,First proxy,lastaccess]
    • Units: unixtime
This item can be created on any host, but it is common to create it either on the Zabbix proxy host, or on the Zabbix server host.

In the key here, the second parameter is the proxy name. Thus, if your proxy was named kermit, the key would become zabbix[proxy,kermit,lastaccess].

If items like these are created on hosts that represent the proxy system and have the same name as the proxy, a template could use the {HOST.HOST} macro as the second parameter in this item key. We discussed templates in Chapter 8, Simplifying Complex Configurations with Templates.
  2. When done, click on the Add button at the bottom.
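
The same item could, in principle, be created through the Zabbix JSON-RPC API instead of the frontend. The following is a minimal sketch that only builds the parameters for an item.create call and sends nothing. It assumes the API's numeric codes (type 5 for Zabbix internal, value_type 3 for numeric unsigned); the host ID and the update interval are hypothetical placeholders. Check these against the API documentation for your Zabbix version before relying on them.

```python
def proxy_lastaccess_key(proxy_name):
    """Build the internal item key for a proxy's last access time."""
    return "zabbix[proxy,{},lastaccess]".format(proxy_name)


def lastaccess_item_params(hostid, proxy_name, delay=60):
    """Return item.create parameters mirroring the frontend form above.

    Assumptions: type 5 means "Zabbix internal" and value_type 3 means
    "numeric unsigned" in the Zabbix API; hostid and delay are placeholders.
    """
    return {
        "hostid": hostid,
        "name": "$2: last access",
        "type": 5,
        "key_": proxy_lastaccess_key(proxy_name),
        "value_type": 3,
        "units": "unixtime",
        "delay": delay,
    }


# "10105" is a made-up host ID used only for illustration.
params = lastaccess_item_params("10105", "First proxy")
print(params["key_"])
```

Keeping the key construction in its own function makes it easy to reuse the same proxy name in the trigger expression later.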

Notice how we used a special unit here: unixtime. What does it do? To find out, navigate to Monitoring | Latest data, expand the Filter, select the host you created the last item on, enter proxy in the Name field, and then click on the Filter button. The data is presented in a human-readable form that shows when the proxy last contacted the Zabbix server.

So this item will be recording the time when the proxy last contacted the Zabbix server. That's great, but hardly enough to notice problems in an everyday routine—we already know that a trigger is needed. Here, the already-familiar fuzzytime() function comes to the rescue.

Navigate to Configuration | Hosts, click on Triggers next to the host you created the proxy last access item on, then click on the Create trigger button.

Let's say we have a fairly loaded and critical proxy—we would like to know when three minutes have passed without the proxy reporting back. In such a case, a trigger expression such as this could be used:

{host:zabbix[proxy,proxy name,lastaccess].fuzzytime(180)}=0 
One could consider using the Simple change value for the last access item, which would return 0 when the proxy is not communicating. The trigger for such an item is more obscure, thus fuzzytime() is the most common trigger function for this purpose.
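
To make the trigger logic concrete, here is a simplified Python model of what fuzzytime() evaluates. It is a sketch of the semantics, not Zabbix's implementation: the function returns 1 when the timestamp stored in the item is within the given number of seconds of the server time, and 0 otherwise. The now parameter is a hypothetical addition that makes the example deterministic.

```python
import time


def fuzzytime(item_value, max_diff, now=None):
    """Simplified model of fuzzytime(): 1 if the timestamp stored in the
    item is within max_diff seconds of the server time, otherwise 0."""
    if now is None:
        now = time.time()
    return 1 if abs(now - item_value) <= max_diff else 0


def proxy_trigger_fires(last_access, now=None):
    """Mirror the trigger expression: the fuzzytime(180) result equals 0."""
    return fuzzytime(last_access, 180, now) == 0


server_time = 1_000_000  # arbitrary fixed "server clock" for the example
print(proxy_trigger_fires(server_time - 60, now=server_time))   # last seen 1 minute ago
print(proxy_trigger_fires(server_time - 240, now=server_time))  # last seen 4 minutes ago
```

With a proxy last seen one minute ago, the trigger stays quiet; with four minutes of silence, it fires.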

As we might recall, the proxy connects to the server in two cases: it either synchronizes the configuration or sends the collected data. What if, for some reason, all occurrences of both of these events are further apart than three minutes? Luckily, the Zabbix proxy has a heartbeat process, which reports back to the server at regular intervals. Even better, this timing is configurable. Again, take a look at zabbix_proxy.conf, this time looking for the HeartbeatFrequency parameter, which by default looks like this:

# HeartbeatFrequency=60

Specified in seconds, this value means that the proxy will report back to the server every minute, even if there are no new values to send. The lastaccess item is therefore quite a reliable way to figure out when a proxy is most likely down or at least inaccessible, even if the proxy has no data to send for a longer period of time.
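
Because the heartbeat sets how often the last access time is refreshed on an otherwise idle proxy, the trigger threshold should be longer than the heartbeat interval. The following sketch reads the effective HeartbeatFrequency from the text of a configuration file and checks a threshold against it. The helper names are hypothetical, and treating a value of 0 as "heartbeat disabled" is an assumption to verify against the comments in your zabbix_proxy.conf.

```python
DEFAULT_HEARTBEAT = 60  # default stated in the text, in seconds


def heartbeat_frequency(conf_text):
    """Return the effective HeartbeatFrequency from config file text.

    Commented-out lines are ignored, so the default applies to them.
    """
    value = DEFAULT_HEARTBEAT
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, raw = line.partition("=")
        if sep and name.strip() == "HeartbeatFrequency":
            value = int(raw.strip())
    return value


def threshold_is_safe(trigger_seconds, heartbeat_seconds):
    """A fuzzytime() threshold should exceed the heartbeat interval.

    Assumption: 0 means the heartbeat is disabled, so no threshold is safe.
    """
    return heartbeat_seconds > 0 and trigger_seconds > heartbeat_seconds


heartbeat = heartbeat_frequency("# HeartbeatFrequency=60\n")
print(heartbeat, threshold_is_safe(180, heartbeat))
```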

For our trigger, fill in the following values:

  • Name: Proxy "First proxy" not connected for 3 minutes
  • Expression: {Another host:zabbix[proxy,First proxy,lastaccess].fuzzytime(3m)}=0
  • Severity: High
In the expression, replace Another host with the name of the host on which the proxy last access item was created. If the last access item used the {HOST.HOST} macro, use the same macro in the trigger name and expression, too.

We could have used 180 in place of 3m, but the time suffix version is a bit easier to read. Time suffixes were discussed in Chapter 6, Detecting Problems with Triggers. When done, click on the Add button at the bottom.
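
The equivalence of 3m and 180 can be illustrated with a small conversion helper. This is a sketch only: it handles the s, m, h, d, and w time suffixes and plain integers, and the function name is hypothetical.

```python
# Seconds represented by each time suffix.
SUFFIX_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def to_seconds(value):
    """Convert a value such as '3m' or '180' to a number of seconds."""
    value = value.strip()
    if value and value[-1] in SUFFIX_SECONDS:
        return int(value[:-1]) * SUFFIX_SECONDS[value[-1]]
    return int(value)


print(to_seconds("3m") == to_seconds("180"))
```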

This combination of an item and a trigger will alert us when the proxy becomes unavailable. Now we just have to make all availability triggers for hosts behind this proxy depend on the proxy last access trigger.

Unfortunately, this approach has a common problem. When proxy-server communication is interrupted, the proxy last access trigger fires and masks all other triggers because of the dependency. While the proxy is unable to connect to the server, it still collects values. Once communication is restored, the proxy sends all the values to the server, older values first. The moment the first value is sent, the last access item is updated and the trigger resolves.

At this point, however, the proxy is still sending values that were collected 5, 30, or 60 minutes ago. Any nodata() triggers that check a shorter period will fire. As a result, the proxy trigger dependency works only until the proxy comes back, and a huge event storm follows when it does.

How can we solve it? We could try to find out how many unsent values the proxy has and, if there are too many, ignore all the triggers behind the proxy. In essence, a proxy with a large value buffer would be treated the same as an unreachable proxy.
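
The size of this window can be estimated with a rough model. The sketch below is a deliberate simplification, not a description of Zabbix internals: it assumes the proxy delivers buffered history at a constant rate, expressed by the hypothetical catchup_rate parameter (seconds of history delivered per real second), and counts how long after reconnecting a nodata() check of a given period would still see only stale values.

```python
def seconds_of_false_alarms(outage, nodata_period, catchup_rate):
    """Count seconds after reconnect during which a nodata(nodata_period)
    check would still see only stale data.

    catchup_rate: seconds of buffered history delivered per real second
    (a hypothetical parameter; it must be greater than 1 to ever catch up).
    """
    if catchup_rate <= 1:
        raise ValueError("proxy never catches up")
    t = 0
    age = outage  # age of the newest value the server has received
    while age > nodata_period:
        t += 1
        # Each real second, new data also accumulates, so the net gain
        # is (catchup_rate - 1) seconds of history per second.
        age = outage - t * (catchup_rate - 1)
    return t


# One-hour outage, nodata(5m) triggers, 11 seconds of history per second.
print(seconds_of_false_alarms(3600, 300, 11))  # → 330
```

Under these assumed numbers, the last access trigger has already resolved, yet dependent nodata() triggers could keep firing for about five and a half minutes.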
