Performance considerations

Zabbix tends to perform nicely for small installations, but as the monitored environment grows, one might run into performance problems. A full Zabbix performance discussion is out of scope here, but let's look at the starting points for a healthy configuration and the directions for further research:

  • Monitor only what you really need, as rarely as possible, and keep the data only for as long as really needed. It is common for new users of Zabbix to use the default templates as-is, add a lot of new items with low intervals...and never look at the data. It is suggested to clone the default templates, remove every item that is not needed, increase the intervals as much as possible, and reduce the history and trend-retention periods. There are also events, alerts, and other data; we will discuss their storage settings a bit later.
  • When using Zabbix agents, use active items. Active items will result in a smaller number of network connections and reduce the load on the Zabbix server. There are some features not supported with active items, so sometimes, you will have to use passive items. We discussed what can and cannot be done with active items in Chapter 3, Monitoring with Zabbix Agents and Basic Protocols.
  • Use Zabbix proxies. They collect data and send it to the Zabbix server in bulk, reducing the work the server has to do even further. We discussed proxies in Chapter 19, Using Proxies to Monitor Remote Locations.
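
To see why item counts, intervals, and retention periods matter so much, it helps to run the numbers. The following is a rough sketch, not an official sizing tool: the function names are made up for this example, and the default of 90 bytes per stored row is an assumption that varies with the database engine and the value type. It relies on trends holding one row per item per hour.

```python
SECONDS_PER_DAY = 24 * 3600


def values_per_second(item_count: int, interval_seconds: float) -> float:
    """New values per second the server has to process for these items."""
    return item_count / interval_seconds


def history_bytes(item_count: int, interval_seconds: float,
                  history_days: int, bytes_per_value: int = 90) -> float:
    """Approximate history storage; bytes_per_value is an assumed average."""
    rate = values_per_second(item_count, interval_seconds)
    return rate * SECONDS_PER_DAY * history_days * bytes_per_value


def trend_bytes(item_count: int, trend_days: int,
                bytes_per_row: int = 90) -> int:
    """Approximate trend storage: one row per item per hour."""
    return item_count * 24 * trend_days * bytes_per_row


if __name__ == "__main__":
    # 3,000 items polled every 60 seconds versus every 300 seconds,
    # both keeping 30 days of history.
    for interval in (60, 300):
        rate = values_per_second(3000, interval)
        size_gib = history_bytes(3000, interval, 30) / 1024 ** 3
        print(f"interval {interval}s: {rate:.1f} values/s, "
              f"about {size_gib:.1f} GiB of history")
```

Raising the interval or shortening the history period reduces the history size proportionally, which is why trimming templates is usually the first thing to try.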

We already know about the history and trend-retention periods for items—but for how long does Zabbix store events, alerts, acknowledgment messages, and other data? This is configurable by going to Administration | General and choosing Housekeeping in the dropdown in the upper-right corner:

Note

This page is excessively long, so the preceding screenshot only shows a small section from the top.

Here, we may configure for how long to keep the following data:

  • Events: We may choose separate storage periods for trigger, internal, network discovery, and active agent autoregistration events. Note that removing an event will also remove all associated alerts and acknowledgment messages.
  • IT service data: The IT service up and down state is recorded separately from trigger events, and its retention period can be configured separately as well.
  • Audit data: This specifies how long to store the audit data for. We will discuss what that actually is in a moment.
  • User sessions: User sessions that have been closed will be removed more frequently, but active user sessions will be removed after 1 year by default. This means that one cannot stay logged in for longer than a year.

These values should be kept reasonably low. Keeping data for a long period of time will increase the database size, and that can impact the performance a lot.

What about the history and trend settings in here? While they are normally configured per item, we may override them here. Also, internal housekeeping may be disabled for each of the entries. These options are aimed at users who have to manage large Zabbix installations. When the database grows really large, its performance degrades significantly. It can be improved by partitioning the biggest tables, that is, splitting them up by some criteria. With Zabbix, it is common to partition the history and trends tables, sometimes adding the events and alerts tables. If partitioning is used, old data is removed by dropping whole partitions, and the internal housekeeping for those tables should be disabled.

A lot of people in the Zabbix community will eagerly suggest partitioning at the first opportunity. Unless you plan to have a really large installation or know database partitioning really well, it might be better to hold off. There is no officially supported or built-in partitioning scheme yet, but one might appear in the future. If it does and your partitioning scheme is different, it will be up to you to synchronize it with the official one.
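
To illustrate what partition maintenance replaces the housekeeper with, here is a minimal sketch of the planning step for daily partitions. It only decides which partitions to drop and which to create; it does not talk to a database. The naming pattern (`p2024_01_15`), the function names, and the retention numbers are all hypothetical. The one thing taken from Zabbix itself is that the history tables store time as a Unix timestamp in the `clock` column, so partition boundaries are expressed as such timestamps.

```python
from datetime import date, datetime, timedelta, timezone


def partition_name(day: date) -> str:
    """Hypothetical naming scheme: one partition per day."""
    return f"p{day:%Y_%m_%d}"


def upper_bound(day: date) -> int:
    """Exclusive upper bound for the day: next midnight UTC as a Unix timestamp."""
    nxt = day + timedelta(days=1)
    midnight = datetime(nxt.year, nxt.month, nxt.day, tzinfo=timezone.utc)
    return int(midnight.timestamp())


def plan(existing: list[date], today: date, keep_days: int, create_ahead: int):
    """Return (days to drop, days to create) for one partitioned table."""
    have = set(existing)
    oldest_kept = today - timedelta(days=keep_days)
    # Anything older than the retention window is dropped as a whole partition.
    to_drop = sorted(d for d in have if d < oldest_kept)
    # Make sure partitions exist for today and a few days ahead.
    wanted = [today + timedelta(days=i) for i in range(create_ahead + 1)]
    to_create = [d for d in wanted if d not in have]
    return to_drop, to_create


if __name__ == "__main__":
    today = date(2024, 1, 10)
    existing = [date(2024, 1, 1) + timedelta(days=i) for i in range(10)]
    drop, create = plan(existing, today, keep_days=7, create_ahead=2)
    for d in drop:
        print("drop  ", partition_name(d))
    for d in create:
        print("create", partition_name(d), "values less than", upper_bound(d))
```

Dropping a whole partition is much cheaper for the database than deleting the same rows one by one, which is why internal housekeeping is switched off for partitioned tables.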
