Extracting content from web pages

When creating the web-monitoring scenario, we extracted content from a page to be reused later. Simpler agent-based monitoring can also extract some content from a page. As a test, let's try to extract the text after Remember me for and up to days. Click on Create item again, and fill in the following:

  • Name: Zabbix remembers me for
  • Key: web.page.regexp[127.0.0.1,/zabbix/index.php,,"Remember me for.(\d.)",,\1]
  • Type of information: Character
  • Applications: ZABBIX

When you've done that, click on the Add button at the bottom.

The item key works with Zabbix server page contents at the time of writing. If the web page gets redesigned, consider it an extra challenge to adapt the regular expression.

For this item, we are extracting a value from the page directly. The important parameter here is the fourth one: it is a regular expression that will be matched against the page source. In this case, we are looking for the Remember me for string and capturing the two digits after it. When the regular expression contains a comma, it's best to double-quote it. A comma is the item key parameter separator, so it could be misinterpreted. Then, in the last parameter, we request that only the contents of the first capture group be returned. By default, the whole matched string is returned. For more details on value extraction with this method, refer to the Log file monitoring section in Chapter 10, Advanced Item Monitoring. We also chose Type of information as Character, which will limit the values to 255 characters, just in case it matches a huge string.

For this key, the fifth parameter allows us to limit the length of the returned string. If you want to extract a number and send it over SMS, limiting the length of the extracted string to 50 characters would reduce the possibility of the message being too long.
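To make the roles of the regular expression, length, and output parameters more concrete, here is a minimal Python sketch that imitates them outside of Zabbix. It is not how the agent is implemented. The page_regexp function, the sample page fragment, and the exact pattern (one wildcard character followed by a captured digit and one more character) are assumptions made for illustration, as is the choice to apply the length limit after the output template and to return an empty string when nothing matches.

```python
import re

def page_regexp(page, pattern, length=None, output=None):
    """Hypothetical stand-in for the regexp, length, and output parameters."""
    match = re.search(pattern, page)
    if match is None:
        return ""  # assumption: no match yields an empty value
    if output is None:
        value = match.group(0)        # default: the whole matched string
    else:
        value = match.expand(output)  # for example r"\1" for the first group
    if length is not None:
        value = value[:length]        # keep at most `length` characters
    return value

# Made-up fragment of a login page, not copied from a real frontend.
sample = '<label for="autologin">Remember me for 30 days</label>'
pattern = r"Remember me for.(\d.)"

print(page_regexp(sample, pattern))                 # whole match
print(page_regexp(sample, pattern, output=r"\1"))   # first capture group only
print(page_regexp(sample, pattern, length=8))       # whole match, cut to 8 characters
```

Running the sketch against a saved copy of the real page source is a quick way to check a pattern before putting it into an item key.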

A practical application of this item would be extracting statistics from an Apache web server when using mod_status or similar functionality with other server software.
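As a rough illustration of that idea, the following Python sketch pulls individual values out of text shaped like the machine-readable mod_status output. The sample text and its numbers are invented, and the status_value helper is hypothetical; the same kind of pattern could be tried in the fourth parameter of the item key once it works here.

```python
import re

# Invented sample in the style of mod_status machine-readable output.
SAMPLE_STATUS = """Total Accesses: 1520
Total kBytes: 3045
Uptime: 86400
BusyWorkers: 3
IdleWorkers: 7
"""

def status_value(page, field):
    """Return the text after 'field: ' on its own line, or None if absent."""
    pattern = r"^" + re.escape(field) + r": (\S+)$"
    match = re.search(pattern, page, re.MULTILINE)
    return match.group(1) if match else None

print(status_value(SAMPLE_STATUS, "BusyWorkers"))
print(status_value(SAMPLE_STATUS, "Total Accesses"))
```

Anchoring the pattern to the start of a line avoids accidentally matching a field whose name is a suffix of another field name.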

None of the three web.page.* items supports HTTPS, authentication, or redirects at this time.

With the items configured, let's check their returned values—head to Monitoring | Latest data, clear out the Host groups field, select Linux servers in the Hosts field, and then click on Filter. Look for items in the Zabbix application:

Each item requests the page separately.

The items should be returning full page contents, the time it took to load the page, and the result of our regular expression. The web.page.get item always includes headers, too. If you see empty values appearing every now and then in the web.page.get and web.page.regexp items, this probably happens because the request has timed out. While web scenarios had their own timeout setting, the agent items obey the agent timeout, which is 3 seconds by default. The web.page.perf item returns 0 upon a timeout.
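The timeout behavior described above can be pictured with a small Python sketch. It only models the outcome (an empty value or 0 when the request times out) and says nothing about how the agent works internally. The page_get and page_perf functions, the fetch callable, and the two fake pages are all hypothetical names introduced for this example.

```python
import time

AGENT_TIMEOUT = 3.0  # seconds; the default agent timeout mentioned in the text

def page_get(fetch, timeout=AGENT_TIMEOUT):
    """Return headers plus body, or an empty string if the fetch times out."""
    try:
        return fetch(timeout)
    except TimeoutError:
        return ""

def page_perf(fetch, timeout=AGENT_TIMEOUT):
    """Return the load time in seconds, or 0 if the fetch times out."""
    start = time.monotonic()
    try:
        fetch(timeout)
    except TimeoutError:
        return 0.0
    return time.monotonic() - start

# Fake fetchers standing in for real HTTP requests.
def fast_page(timeout):
    return "HTTP/1.1 200 OK\r\n\r\n<html>ok</html>"

def slow_page(timeout):
    raise TimeoutError("no answer within %.0f seconds" % timeout)

print(repr(page_get(slow_page)))   # empty value after a timeout
print(page_perf(slow_page))        # 0.0 after a timeout
print(page_perf(fast_page) >= 0)   # a real measurement otherwise
```

One consequence worth noting: a load time of 0 is ambiguous on a graph, so a check that treats 0 as a failure rather than as a very fast page is usually the safer reading.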

The Zabbix web.page.get item currently does not work properly with chunked transfer encoding, which is widely used. Extra data is inserted in the page contents. This was expected to be fixed in Zabbix 3.0, by using libcurl for these agent items as well, but that development was not finished. At the time of writing, it is not known when this will be fixed.
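To see why chunked transfer encoding gets in the way, it helps to look at what a chunked body looks like on the wire: each chunk is preceded by its size in hexadecimal, and a zero-sized chunk ends the body. The Python sketch below decodes a made-up chunked body and shows that a search on the raw data can fail even though the decoded page contains the string. The decode_chunked helper is written for this illustration only; it ignores trailers and is not what Zabbix does.

```python
import re

def decode_chunked(body: bytes) -> bytes:
    """Reassemble an HTTP/1.1 chunked body (trailers are ignored)."""
    out = bytearray()
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        # The size line may carry extensions after ';', which we drop.
        size_field = body[pos:line_end].split(b";", 1)[0]
        size = int(size_field.decode("ascii").strip(), 16)
        pos = line_end + 2
        if size == 0:
            break  # last chunk
        out += body[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(out)

# Made-up body: the text is split across two chunks of 7 and 16 (0x10) bytes.
raw = b"7\r\nRemembe\r\n10\r\nr me for 30 days\r\n0\r\n\r\n"
pattern = rb"Remember me for.(\d.)"

print(re.search(pattern, raw))                               # None: chunk sizes break the string
print(re.search(pattern, decode_chunked(raw)).group(1))      # b'30'
```

The chunk size lines ("7" and "10" here) are the kind of extra data that ends up mixed into the page contents when the body is not decoded.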

Using these items, we could trigger when a page takes too long to load, when it doesn't work at all, or when a specific string cannot be found on the page. For the last case, we could use str() and similar trigger functions on either the whole page item or the content extraction item.

Web scenarios are executed on the Zabbix server, agent items on the agent. We will discuss running web scenarios on remote systems in Chapter 17, Using Proxies to Monitor Remote Locations.

The items we created all went to the same Zabbix agent. We can also create a host with multiple interfaces and assign items to each interface. This allows us to check a web page from multiple locations but keep the results in a single host. We still have to make the item keys unique. If needed, use empty key parameters, extra commas in key parameters, or key aliasing, discussed in Chapter 20, Zabbix Maintenance. Note that templates can't be used in such a setup.
