Now, what use would all of that collected data in Zabbix be without actually doing some alerting with it? Of course, we can use Zabbix to collect our data and just go over it manually, but Zabbix gets a lot more useful when we actually start sending out notifications to users. This way, we don't have to always keep an eye on our Zabbix frontend, but we can just let our triggers and alerts do the work for us, redirecting us to the frontend only when we need it.
In Zabbix 6, you will find a new trigger expression syntax compared to Zabbix 5. This syntax has been available since Zabbix 5.4, so this is the first time we'll be working with it in an LTS release. If you've worked with Zabbix before version 6, keep in mind that you might need to get used to this new syntax.
We will learn all about setting up effective triggers with the new expression format and about alerts in the following recipes:
For this chapter, we will need a Zabbix server—for instance, the one used in the previous chapter, which would be the following:
We will also need a Linux host to monitor so that we can actually build some cool triggers to use.
Triggers are important in Zabbix because they notify you as to what's going on with your data. We want a trigger to fire when our data reaches a certain threshold.
So, let's get started with setting up some cool triggers. There are loads of different options for defining triggers, but after reading this recipe you should be able to set up some of the most prominent triggers. Let's take your trigger experience to the next level.
For this recipe, we will need our Zabbix server ready and we will need a Linux host. I will use the lar-book-agent_simple host from the previous chapter because we already have some items on that.
We'll also need one more host that is monitored by the Zabbix agent with the Zabbix agent template. We'll use one of the items on this host to create a trigger. This will be the lar-book-agent_passive host from the previous chapter.
On this host, we will already have some triggers available, but we will extend these triggers further to inform us even better.
In this section, we are going to create three triggers to monitor state changes. Let's get started by creating our first trigger.
Let's create a simple trigger on the lar-book-agent_simple host. We made a simple check on this host called Check if port 22 is available, but we haven't created anything to notify us on this yet:
iptables -A INPUT -p tcp -i ens192 -s 10.16.16.152 --destination-port 22 -j DROP
ip addr
Now, to create our second trigger, let's ramp it up a bit. If you followed Chapter 3, Setting Up Zabbix Monitoring, in the recipe titled Setting up HTTP agent monitoring, we created an item that polls one of our website pages for its visitor count. Now, what we probably want to do is keep an eye on how many of our readers are reaching the web page part of the book and building an item for it:
We have seen triggers that use one item, but we can also use multiple items in a single trigger. Let's build a new trigger by using multiple items in the same expression:
ip addr
Tip: On the trigger creation page, use the Add button next to the Expression field to add a condition and build your expression easily. For example, we can use the Select button to pick an item from a list. Also very useful, when using the Function drop-down menu, there's a short explanation for every trigger function included:
We need a good understanding of how to build triggers and how they work, so we can create a well-set-up monitoring system. Especially important here is that we make sure that our triggers are set up correctly and we test them well. Triggers are a very important part of Zabbix as they will be informing you of things actively. Configure your triggers too loosely and you will be missing things. Configure them too strictly and you will be overloaded with information.
In all of these triggers, we have also included a trigger severity, as we can see in the screenshot below.
These severities are important to make sure your alerts are correctly ranked by importance. We can also filter on these severities in several places in the Zabbix frontend and even in things like actions.
Now, let's discover why we built our triggers as we did.
This is a very simple but effective trigger to set up in Zabbix. When our value returns us either a 1 for UP or a 0 for DOWN, we can easily create triggers such as these—not just for monitoring logical ports that are up or down, but for everything that returns us a simple value change from, for example, 1 to 0 and vice versa.
Now, if we break down our expression, we will see the following:
When building an expression, we have four parts:
Now, for our first trigger, we defined our host and the item that gives us the SSH status. What we are asking in the trigger function is that we want the last value to be 0 before triggering it.
For this item, that would mean it would trigger within a minute because in our item we specified the following:
Looking at the Update interval field on the Item configuration page, we can determine that this trigger expects a value of 0 and that, because of the 1-minute interval, it will fire after at most 1 minute of downtime on SSH port 22.
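The logic behind this first trigger can be modelled in a few lines of Python. This is only a sketch, assuming the item returns 1 for UP and 0 for DOWN as described above; the function name and the sample values are hypothetical and do not reflect how Zabbix itself is implemented.

```python
def ssh_down(history):
    """Model of a 'last value equals 0' check; history is ordered oldest to newest."""
    if not history:
        return False  # no data collected yet, so there is nothing to evaluate
    return history[-1] == 0


# One value per 1-minute update interval (illustrative data only).
print(ssh_down([1, 1, 1]))  # the latest check succeeded
print(ssh_down([1, 1, 0]))  # the latest check failed, so the trigger would fire
```

Because only the most recent value is inspected, the detection delay is bounded by the item's update interval.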
Now, for our second trigger, we did something different. We not only made an expression for triggering this problem, but also one for recovering from it. What we do in the Problem expression option is define a trigger function, telling Zabbix to compare the previous value with the latest value and calculate the difference between the values. We then state that the result of this trigger function has to be >=50, meaning equal to or higher than 50.
So, our trigger will only be activated when there is a difference of 50 or more visitors between the previous and the last value for this specific page. Now, we could handle recovery the same way as in the previous trigger and let it recover as soon as the problem expression is no longer true. This means that, for this trigger, it would recover once the difference between the last and previous value drops below 50 again. But I want to keep this trigger in the PROBLEM state just a little longer.
Therefore, I defined a recovery expression as well. I'm telling it that this problem can only recover if the visitor count between the last and previous value has dropped below or was equal to 40. Check out the recovery expression up close:
Recovery expressions are powerful when you want to extend your trigger functionality with just a bit more control over when it comes back into the OK state.
Tip
You can use the recovery expression for extending the trigger's PROBLEM state beyond what you defined in the Problem expression option. This way, we know we are still close to the PROBLEM state. We define that we only want the trigger to go back to the OK state after we've reached another threshold as defined in the recovery expression.
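The effect of a separate recovery expression can be sketched as a small state machine. The thresholds 50 and 40 come from the text above; the function names and the sample series are hypothetical, and the sketch assumes the trigger returns to OK only when the recovery expression is true and the problem expression is false.

```python
PROBLEM_THRESHOLD = 50   # problem expression: change >= 50
RECOVERY_THRESHOLD = 40  # recovery expression: change <= 40


def next_state(in_problem, change):
    """Return True if the trigger is in PROBLEM after seeing this change value."""
    problem_expr = change >= PROBLEM_THRESHOLD
    recovery_expr = change <= RECOVERY_THRESHOLD
    if in_problem:
        # Stay in PROBLEM until the recovery expression holds.
        return not (recovery_expr and not problem_expr)
    return problem_expr


def run(changes):
    """Feed a series of change values through the trigger and record each state."""
    state = False
    states = []
    for change in changes:
        state = next_state(state, change)
        states.append("PROBLEM" if state else "OK")
    return states


print(run([10, 55, 45, 42, 40, 45]))
```

In this series the values 45 and 42 are already below the problem threshold, yet the trigger stays in PROBLEM until the value 40 satisfies the recovery expression.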
Now, trigger 3 might seem complicated because we've used more than one item, but it's basically the same setup:
We have the same setup for the expression, with function/host/item key/value. Yet when we are working with multiple items, we can add an or statement between the items. This way, we can say we need to match one of the items before triggering the PROBLEM state. In this case, we trigger when either item reaches above the threshold.
Important Note
In this trigger expression, we have some empty lines between the different item expressions. Empty lines between item expressions are totally fine and actually make for good readability. Use this wisely when building triggers.
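As a rough model of the or statement described above, the following Python sketch fires when either of two item values is above a threshold. The parameter names and the numbers are placeholders chosen for illustration, not the items or threshold used in the recipe.

```python
def either_above(item_a, item_b, threshold):
    """Model of 'item A above threshold or item B above threshold'."""
    return item_a > threshold or item_b > threshold


# Only one of the two values needs to cross the threshold (illustrative numbers).
print(either_above(item_a=120, item_b=30, threshold=100))
print(either_above(item_a=80, item_b=30, threshold=100))
```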
Now if you've worked with Zabbix before, the next part might be interesting to you. As mentioned in our introduction, there is a big update to expressions within Zabbix. Trigger expressions now work in a new way, which is the same way as you will see in calculated items and other places for a unified experience.
Let's take a look at the old expression syntax as seen in Zabbix 5.2 and older versions:
In the old syntax, we always started with a curly bracket and then the hostname or template name. Between the hostname or template name and the item key, we had a colon. Marking the end of the item key we had a dot, but item keys can also include dots themselves. Then after the dot, we have the trigger function followed by the ending curly bracket. Then all we have left is the operator and constant we want to hold the expression against.
As you might see, this could become confusing at times, especially when using dots in item keys. Now let's check out the new trigger syntax:
Our new trigger syntax starts off right away with our trigger function; no hassle, just immediately showing you what we're doing with this line. This is followed by a bracket and a forward slash before entering the host or template name. We then use another forward slash to divide the hostname or template name and the item key. We end with a bracket and then all we have left is the operator and value we want to hold the expression against.
Starting with the trigger function makes for a clear indicator of what your line is doing. Putting the hostname or template name into brackets and then dividing it with forward slashes from the item key make for a more cohesive experience when writing expressions. We also don't have confusing extra dots any longer. Altogether a very nice change to the trigger syntax, which in all honesty might take a bit of time to get used to.
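To make the difference between the two layouts concrete, here is a small Python sketch that rearranges a single old-style expression into the new layout. It is not an official migration tool: it only handles the simplest form with one function call, it copies any function parameters over unchanged, and the item key in the example is an assumption used for illustration.

```python
import re

# {host:item.key.function(params)}operator-and-constant
OLD_SYNTAX = re.compile(
    r"^\{(?P<host>[^:]+):(?P<key>.+)\.(?P<func>\w+)"
    r"\((?P<params>[^()]*)\)\}(?P<rest>.*)$"
)


def to_new_syntax(old_expression):
    """Rearrange one old-style expression into function(/host/key)operator-constant."""
    match = OLD_SYNTAX.match(old_expression.strip())
    if match is None:
        raise ValueError(f"unsupported expression: {old_expression!r}")
    parts = match.groupdict()
    target = f"/{parts['host']}/{parts['key']}"
    # Parameters of the old function move behind the host and item key.
    args = f"{target},{parts['params']}" if parts["params"] else target
    return f"{parts['func']}({args}){parts['rest']}"


# The host name comes from this chapter; the item key is assumed for the example.
print(to_new_syntax("{lar-book-agent_simple:net.tcp.service[ssh,,22].last()}=0"))
```

Note how the sketch has to search for the last dot before the function name, which is exactly the ambiguity the new syntax removes.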
It's the small stuff that makes the entire software feel more professional and thought out. Zabbix including changes like these really helps that along.
Not only can we match one of the items in a trigger expression, but we can also use an and statement. This way, we can make sure our trigger only goes into a PROBLEM state when multiple items are reaching a certain value. Triggers are very powerful like this, allowing us to define our own criteria in great detail. There's no predefinition: we can add as many and/not/or statements and different functions as we like in the trigger expressions. Customize your triggers to exactly what you need, and suddenly you are going to have a lot more peace of mind because you know your triggers will notify you when something is up.
To know more about trigger expressions, check out the Zabbix documentation. There's a lot of information on which functions you can use to build the perfect trigger. For more details, go to https://www.zabbix.com/documentation/current/en/manual/config/triggers/expression.
Triggers in Zabbix keep getting more advanced and it might be hard to keep up. For people working with Zabbix 5.2 or older and upgrading to Zabbix 6, not only is there a new Zabbix trigger syntax but there's also a whole new array of functions.
Let's dive into setting up some more advanced triggers in Zabbix 6.
For this recipe, we will need our Zabbix server ready and we'll need one host that is monitored by a Zabbix agent with the Zabbix agent template. We'll use the items on this host to create triggers. Let's use the lar-book-agent_passive host from the previous chapter.
If you don't have this host from the previous chapters, simply hook up a new host with the default passive Linux monitoring template called Linux by Zabbix agent.
We'll also be touching on some more advanced topics that are discussed later in the book. If you don't know how to use Low-Level Discovery (LLD) for example, it might be smart to dive into Chapter 7, Using Discovery for Automatic Creation, first.
Let's take a look at three more advanced triggers, compared to the three we've seen in the previous recipe: trendavg for going through trend data, timeleft to predict values in the future, and time shifting to compare to the past.
First, we'll take a look at one of the newer trigger functionalities, the trend average function:
That's all for creating this trigger. Check out the How it works… section of this recipe to get more information about the trigger.
Next up is our timeleft function, which is very useful for things like space utilization. Let's take a look:
Important Note
In this case, we are creating the trigger prototype directly on the host, using an existing template discovery rule. If you want to apply a trigger like this to every host using a template, make sure to create the trigger on a template level. Furthermore, discovery rules are explained further in Chapter 7, Using Discovery for Automatic Creation, of this book.
Important Note
Using short intervals in predictive triggers to predict long time periods is not recommended. Make sure to use the right data set for the time period we want to use in relation to the time we want to predict.
We now have a new trigger using the timeleft function to tell us when hard disks are filling up within a week. Check out the How it works… section of this recipe to get more information about the trigger.
Lastly, we are going to work with time shifting and in this case, we'll do so in combination with a mathematical function. Time shifting is a little bit of a difficult example, so bear with me.
This is a very complex trigger to set up, so let's dive right into how it's set up in the How it works… section.
Advanced triggers can get very complex. The triggers we have just set up are just the tip of the iceberg. Do not worry if these triggers seem intimidating, as there is plentiful documentation out there to help you set them up, which we can find here: https://www.zabbix.com/documentation/current/en/manual/config/triggers.
It's near impossible to cover every single use case in this book, so the triggers we set up will show you what's possible. Use what you have learned in the examples in your own scenarios, but make sure to apply your own thinking to it.
Let's start off the How it works… section with the trend average. Trend average is one of the few trigger functions that use trend data instead of history data. Let's do a short crash course on the history and trend data in Zabbix. History data is the exact value every time an item reaches its configured update interval. Trend data is the average, minimum, and maximum value over one hour (1h) created from the history data and a count of the number of values.
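The relationship between history and trend data can be sketched in Python as follows. This is a simplified model of the hourly aggregation described above, not Zabbix's actual storage code; the function name and the dictionary field names are hypothetical.

```python
from collections import defaultdict


def build_trends(history):
    """Aggregate (unix_timestamp, value) history pairs into one record per hour."""
    buckets = defaultdict(list)
    for timestamp, value in history:
        hour_start = timestamp - timestamp % 3600  # round down to the full hour
        buckets[hour_start].append(value)
    return {
        hour: {
            "count": len(values),
            "min": min(values),
            "avg": sum(values) / len(values),
            "max": max(values),
        }
        for hour, values in sorted(buckets.items())
    }


# Two values in the first hour and one in the second (illustrative data).
print(build_trends([(0, 10), (60, 20), (3600, 30)]))
```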
Now, let's look at the available functions for creating triggers using trend data:
As I said, all of these will use our trend values. The values used are stored in a special Zabbix trend cache, for use in our trigger. We've used the trendavg function. Let's check out how we used it in our trigger expression again.
We start off our trigger with the function trendavg and then the host/template and item key as we've seen earlier in our last recipe. What's new here is the part where we state 1w:now-1w. This is the time period: the 1w before the colon is the length of the period, and now-1w after the colon shifts that period back by one week.
What this means is that, if the average value from our trends over that one-week period one week ago is above 800 Mbps, then this trigger will go into a problem state.
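A rough Python model of this check is shown below. It assumes that 1w:now-1w selects a one-week window that ends one week before now, and it takes a plain mean of hourly averages, which may differ from how Zabbix calculates the result internally. All names and sample numbers are hypothetical.

```python
WEEK = 7 * 24 * 3600  # seconds


def trend_avg(hourly_averages, now, period=WEEK, shift=WEEK):
    """Average the hourly trend values inside a window that ends at now - shift."""
    end = now - shift
    start = end - period
    selected = [avg for hour, avg in hourly_averages.items() if start <= hour < end]
    if not selected:
        return None  # no trend data in the window
    return sum(selected) / len(selected)


# Hourly averages in bits per second, keyed by the start of each hour.
sample = {WEEK: 900e6, WEEK + 3600: 900e6, 2 * WEEK + 3600: 100e6}
average = trend_avg(sample, now=3 * WEEK)
print(average is not None and average > 800e6)  # compare against 800 Mbps
```

The value recorded during the most recent week is ignored here, because it falls outside the shifted window.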
timeleft is another very interesting trigger function. We can use timeleft to create triggers that only fire when it expects something to reach a certain threshold in the future. This is called a predictive trigger, as it makes a prediction based on older data.
Let's check out our trigger expression again.
As we can see, we start our expression as usual: the trigger function, host/template, and our item key. In this case, we combine that with a time period we want to use for our predictive trigger to define its prediction. We use 7h, to tell this expression to use 7 hours of historic data. Combine that with a threshold of 100, to make sure this will trigger if we expect to reach 100% disk space usage. Now we only need one more element to complete this, the expected result, which in this case is <1w.
To sum it all up, this trigger expression looks at 7 hours of historic data and if it expects to reach 100% disk space in less than 1 week, it will go into a problem state, alerting you that you will need to make sure your disks don't run out of space.
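The idea behind a predictive check like this can be sketched in Python with a straight-line fit. This is an illustration under the assumption of a simple linear trend and is not Zabbix's own implementation; the function name and the sample data are hypothetical.

```python
WEEK_SECONDS = 7 * 24 * 3600


def time_left(samples, threshold):
    """Estimate the seconds until the threshold is reached, or None if it never is.

    samples is a list of (unix_timestamp, value) pairs ordered oldest to newest.
    """
    count = len(samples)
    if count < 2:
        return None
    mean_t = sum(t for t, _ in samples) / count
    mean_v = sum(v for _, v in samples) / count
    spread = sum((t - mean_t) ** 2 for t, _ in samples)
    if spread == 0:
        return None
    # Least-squares slope and intercept of value over time.
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / spread
    if slope <= 0:
        return None  # usage is flat or shrinking, so the threshold is not reached
    intercept = mean_v - slope * mean_t
    reached_at = (threshold - intercept) / slope
    return max(0.0, reached_at - samples[-1][0])


# Seven hours of hourly samples, growing by one percentage point per hour.
history = [(hour * 3600, 50 + hour) for hour in range(8)]
remaining = time_left(history, threshold=100)
print(remaining is not None and remaining < WEEK_SECONDS)
```

With a slower growth rate the same call returns a value larger than one week, and the check would stay quiet.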
Tip: Combine the timeleft trigger function with other functions to limit how many times you get alerted. For example, with disk space, we might expect a disk to fill up in a week, but you might not want to see that unless the remaining space is also below 50 gigabytes. Add another expression and you are golden:
As a Zabbix trainer, I find that time shifting trigger expressions are where my students and I always need to spend some additional time working out exactly what they do. This makes sense, as it is one of the more complex expressions, and in this example, we even combined it with some mathematical functions.
So let's take another look at our expression and break it down.
I've added line numbers for our convenience. Now we can go over each line and explain what they mean.
That's it for looking at the lines. Now that we know what they do, let's take a look at how it performs in a real-life scenario. We're going to fill out the values manually and see if the expression is TRUE or FALSE. TRUE means that there is a problem and FALSE means everything is fine. So the math is as follows:
(Last week - This week) = Result
If the Result is higher than 20 then the expression is TRUE
This expression is: TRUE/FALSE
Filling it out with 80% memory available last week and only 50% available this week, we can see the following happening:
(80 - 50) = 30
If 30 is higher than 20 the expression is TRUE
This expression is TRUE
Let's do it one more time but with 80% memory available last week and 70% this week:
(80 - 70) = 10
If 10 is higher than 20 the expression is TRUE (it is not)
This expression is FALSE
This is how you should go about setting up your time shifting expressions. Simply use a notebook or whatever you like, write down your expression in simple text for yourself, and do the calculations.
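The same pen-and-paper check can be written as a few lines of Python. The function name is hypothetical; the limit of 20 and the sample percentages are the ones used in the worked examples above.

```python
def memory_dropped_too_much(last_week, this_week, limit=20):
    """TRUE when available memory dropped by more than `limit` percentage points."""
    return (last_week - this_week) > limit


print(memory_dropped_too_much(80, 50))  # (80 - 50) = 30
print(memory_dropped_too_much(80, 70))  # (80 - 70) = 10
```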
Trigger expressions can also be tested within Zabbix itself. If we go to Configuration | Hosts, then Triggers and we select any of our three advanced triggers, we can do a little test. For example, using the time shifting trigger, we can click Expression constructor.
Over here, we can select Test and then fill out our values. Let's use the same 80% and 50% we did in the earlier example.
As you can see, this will tell us whether our expression ends up being TRUE or FALSE, using any values we want to fill. In short, if you want to be sure your math on paper is doing the same thing directly in Zabbix, use the Expression constructor to test it.
Alerting can be a very important part of your Zabbix setup. When we set up alerts, we want the person on the other end to be informed of just what is going on. It's also important to not spam someone with alerts; we want it to be effective.
So, in this recipe, we will go over the basics of setting up alerts, so we know just how to get it right from the start.
For this recipe, we will only need two things. We will have to use our Zabbix server to create our alerts and we will need some triggers, like the triggers from the previous recipe. The triggers will be used to initiate the alerting process to see just how the Zabbix server will convey this information.
There already is one action set up to notify Zabbix administrators of problem events. In Zabbix 6, a lot of features such as Actions and Media are predefined. Most of the time, all we need to do is enable them and fill out some information.
Now, if you're like me and you want to stay on top of things, you are able to create an update notification. This way, we know that—for instance—someone acknowledged a problem and is working on it. Normally, I would select different channels for stuff such as this—for instance, using SMS for high-priority alerts and a Slack or Teams channel for everything else.
As you can see, there are quite a lot of predefined media types in Zabbix 6. We have them for Slack, Opsgenie, and even Telegram. Let's start with something almost everyone has, though: email.
We set this up like this so that we get a message telling us just what's going on. This is fully customizable as well, to reflect just what we want to know.
Now, that's how we set up alerts in Zabbix. You will now receive alerts on your email address, as shown in the following flowchart:
When something breaks, a PROBLEM in Zabbix is triggered by our trigger configuration. Our ACTION will then be triggered by our PROBLEM event and it will use the Media Type and User Media configuration to notify our user. Our user then fixes the issue (for instance, rebooting a stuck server), and then an OK event will be generated. We will then trigger the ACTION again and get an OK message.
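The flow described above can be modelled very roughly in Python. This sketch only mirrors the order of events (a PROBLEM event, then an OK event, each passed through the action to every configured user media entry); the function name, the message text, and the address are made up for illustration.

```python
def run_action(events, user_media, media_type="Email"):
    """Return the notifications an action would send for each event, in order."""
    sent = []
    for event in events:            # for example "PROBLEM", later followed by "OK"
        for address in user_media:  # one notification per configured address
            sent.append((media_type, address, f"{event} event received"))
    return sent


for notification in run_action(["PROBLEM", "OK"], ["admin@example.com"]):
    print(notification)
```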
Tip
Before building alerts such as this, make a workflow (as shown in Figure 4.36) for yourself, specifying just which user groups and users should be notified. This way, you keep it clear for yourself just how you will use Zabbix for alerting.
There are loads of media types and integrations, and we've just touched the tip of the iceberg by seeing a list of predefined ones. Make sure to check out the Zabbix integration list (https://www.zabbix.com/integrations) for more options or build your own using the Zabbix webhooks and other extensions available.
It's important to keep our alerts effective to make sure we are neither overwhelmed nor underwhelmed by notifications. To do this, we will change our trigger and the Email media type to reflect just what we want to see.
We will be using Trigger 1 from the first recipe and the default email media type in Zabbix.
Furthermore, of course, we'll also be using our Zabbix server.
To create effective alerts, let's follow these steps:
Even if you've used the macro {HOST.NAME} in the trigger name, this is quite simple, so fortunately there isn't a lot to change here. If you've used the hostname in the trigger name, we can change the name to reflect a message that is clearer.
Now, Zabbix uses the default configured message under the media type when we do not use a custom message. But if we want to change that message, we can do that here by creating a custom message. Our default under the Email media type looks like the previous screenshot.
We've done two things in this recipe. We've changed our trigger name and we've added a tag to our trigger.
Keeping trigger names clear and defined in a structured way is important to keeping our Zabbix environment structured. Instead of just naming our trigger Port 22 SSH down on {HOST.NAME}, we've added standardization to our setup and can now structure our future triggers in the same way:
Our triggers are all clear and we can immediately see which host, port, and service are down.
On top of that, we've added a tag for the service that is down, which will now immediately display our service in a clear way, alerting us to exactly what is going on:
In Zabbix 6.0 there is a new tag policy. As we created the item used in the trigger with a component tag and we just added a trigger tag for scope, we followed the new standard. In the problem view in the screenshot above, it becomes immediately apparent that we have a problem affecting the availability of the TCP service SSH. The scope tag generally contains one of five options: availability, performance, notification, security, and capacity.
For more information about the new Zabbix 6.0 tag policy, check out the link below:
https://blog.zabbix.com/tags-in-zabbix-6-0-lts-usage-subfilters-and-guidelines/19565/
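If you keep trigger definitions in your own tooling, a small check like the following Python sketch can help enforce the scope values listed above. The five values come from the text; the function name and the way tags are represented as a dictionary are assumptions made for this example.

```python
SCOPE_VALUES = {"availability", "performance", "notification", "security", "capacity"}


def has_valid_scope(tags):
    """Return True when the tags contain a scope tag with one of the listed values."""
    return tags.get("scope") in SCOPE_VALUES


print(has_valid_scope({"scope": "availability"}))
print(has_valid_scope({"scope": "uptime"}))
```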
Another thing we've done is remove the macro {HOST.NAME} if you've used it before. As we can already see which host this trigger is on by checking the Host field, we do not need to add the {HOST.NAME} macro. We need to keep trigger names short and effective and use the hostname macros in Media or simply use the field already available in the frontend.
We've also changed our action in this recipe. Changing a message on Media types is a powerful way to keep our problem channels structured. Sometimes, we want to see less or more information on certain channels and changing media type messages is one way to do this.
We can also create custom messages on an Action level, changing all the messages sent to the selected channels.
What I'm trying to show you in this recipe is that although it might be simple to set up Zabbix, it is not simple to set up a good monitoring solution with Zabbix—or any monitoring tool, for that matter—if you don't plan. Carefully plan out how you want your triggers to be structured before you build everything in your Zabbix installation.
An engineer who works in a structured way and who takes time to build a good monitoring solution will save a lot of hours in the future because they will understand the problem before anyone else.
Alerting is very useful, especially in combination with some of the tricks we've learned in this book so far to keep everything structured. But sometimes, we need a little more from our alerts than what we are already getting from Zabbix out of the box.
In this recipe, we'll do a small bit of customization to make the alerts more our own.
For this recipe, all we are going to need is our current Zabbix server installation.
To customize alerts, follow these steps:
After selecting this, we'll be taken to our next page. This window contains the default Zabbix Trigger severities, as shown in the following screenshot:
Not all companies like using terms such as High and Disaster, but prefer using different severities such as P1 and P2. Using custom severities, we can customize Zabbix to make it more our own and thus reflect what we've already been using in different tools, for example.
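As an illustration of such a renaming scheme, the Python sketch below maps the default Zabbix severity names to priority labels. The six default names are the standard ones; the P1 to P4 labels and which severities they map to are purely hypothetical, so adjust them to whatever your organization already uses.

```python
DEFAULT_SEVERITIES = [
    "Not classified", "Information", "Warning", "Average", "High", "Disaster",
]

# Hypothetical mapping; any severity not listed keeps its default name.
CUSTOM_LABELS = {
    "Disaster": "P1",
    "High": "P2",
    "Average": "P3",
    "Warning": "P4",
}


def display_name(severity):
    """Return the custom label for a severity, falling back to the default name."""
    return CUSTOM_LABELS.get(severity, severity)


for severity in DEFAULT_SEVERITIES:
    print(f"{severity} -> {display_name(severity)}")
```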
Changing custom severities is not a necessity by any means, but it can be a good way to adopt Zabbix more easily if you are used to something different.