Configuring alerts

Although monitoring is a very broad term, and as a job it may include a lot of activities, there are basically two sides to web service monitoring. Graphing values from sensors is important as a tool for making strategic decisions based on visually apparent trends, for post factum investigation, and sometimes even for forensics. The other side is alerting, which allows administrators to react to incidents as quickly as possible and limit the business consequences. All major monitoring solutions include alerting subsystems, and even the toy monitoring script that we built earlier in this chapter is basically a simple alerter.

The http_stub_status module provides only a little information, but it can still be used to react quickly to incidents. These are some good values for detecting unusual conditions that require immediate attention:

  • The incoming request rate is a global indicator of the load on your website. A sudden spike indicates either a surge in popularity, which might be dangerous, or a Denial of Service (DoS) attack, which is always dangerous. A sudden dip may mean a failure, even a total meltdown.
  • A high number of dropped connections may indicate that the configuration is not up to the load. Nginx does not drop connections unless it has to, either because of a resource shortage or, more often, because it has hit a configured resource limit.
  • A high number of active (including waiting) connections may sometimes mean an attempt by a malicious actor to exhaust your connection limits. This is exactly the kind of attack that Nginx usually manages to defend itself against without any external help, but it is still important to monitor and investigate these events.
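
The three values above can be derived from two consecutive samples of the status page. The following is a minimal sketch in Python, not part of any monitoring system: the function names and the sample text are hypothetical, and it assumes the usual plain-text layout of the http_stub_status output, with the dropped connections computed as the difference between the accepts and handled counters.

```python
import re

# Hypothetical sample in the plain-text layout of the http_stub_status page.
SAMPLE_STATUS = """Active connections: 291
server accepts handled requests
 16630948 16630940 31070465
Reading: 6 Writing: 179 Waiting: 106
"""


def parse_stub_status(text):
    """Extract the cumulative counters and the active connection count."""
    active = int(re.search(r"Active connections:[ \t]*(\d+)", text).group(1))
    # The three counters sit on a line of their own: accepts, handled, requests.
    counters = re.search(
        r"^[ \t]*(\d+)[ \t]+(\d+)[ \t]+(\d+)[ \t]*$", text, re.MULTILINE
    )
    accepts, handled, requests = (int(n) for n in counters.groups())
    return {
        "active": active,
        "accepts": accepts,
        "handled": handled,
        "requests": requests,
    }


def derive_metrics(prev, curr, interval_seconds):
    """Turn two consecutive samples into the three alerting values."""
    dropped_before = prev["accepts"] - prev["handled"]
    dropped_now = curr["accepts"] - curr["handled"]
    return {
        # Requests per second over the sampling interval.
        "request_rate": (curr["requests"] - prev["requests"]) / interval_seconds,
        # Connections accepted but not handled since the previous sample.
        "dropped": dropped_now - dropped_before,
        # Current connections, including waiting ones.
        "active": curr["active"],
    }
```

Which values of these metrics count as unusual depends on the site, so the sketch leaves the thresholds out on purpose.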

You can build any number of complex alert conditions from these three values and also invent more metrics relevant to your particular business. The method of creating such alerts is specific to the monitoring system you chose. These are the example alerts we use inside ServerDensity:

  • Munin uses a simple system of two-level thresholds attached to plugins as a way to generate alerts. There are "warning" and "critical" thresholds, and once the actual value reaches one of the thresholds, Munin will generate an alert and send a notification. The system allows default thresholds for all hosts, which may be practical.
  • The more popular way is to set up thresholds for all hosts individually. It is done right in the Munin master configuration file, which is called munin.conf. This file contains a list of all hosts that are under monitoring by this very master. Adding a threshold for a metric available on a host is as simple as this:
    [www1.example.com]
        nginx_status.total.warning 0:10
        nginx_status.total.critical 0:20

    The values are specified as ranges; that is, in the example above, the warning is generated only when the nginx_status.total metric (the total number of current connections) leaves the range from 0 to 10.

    Munin is able to send notifications about the alerts via e-mail, via syslog, via piping to external programs, or via Nagios, which is another popular monitoring system with a more sophisticated alerting subsystem. Configuration of the notification settings is easy enough. See the online Munin guide at http://guide.munin-monitoring.org/en/latest/tutorial/alert.html.

  • All of the interesting metrics can be measured both on the front side, between Nginx and its clients, and on each external link that Nginx has because of its upstream configuration. Unfortunately, the open source Nginx does not contain any means of exposing the metrics from the upstream links for monitoring and alerting.
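
The two-level range thresholds described for Munin above can be illustrated with a short Python sketch. This is not Munin's own code: the function names are hypothetical, only the min:max form is handled (with either bound optionally left empty), and treating a value that sits exactly on a bound as still inside the range is an assumption.

```python
def in_range(value, spec):
    """Return True if value lies inside a 'min:max' range.

    An empty bound means that side is unbounded. Only the colon form
    is supported in this sketch.
    """
    if ":" not in spec:
        raise ValueError("expected a 'min:max' range, got %r" % spec)
    low, _, high = spec.partition(":")
    if low and value < float(low):
        return False
    if high and value > float(high):
        return False
    return True


def classify(value, warning, critical):
    """Map a value to 'ok', 'warning' or 'critical'.

    The critical range is checked first so that the more severe
    level wins when the value is outside both ranges.
    """
    if not in_range(value, critical):
        return "critical"
    if not in_range(value, warning):
        return "warning"
    return "ok"
```

With the ranges from the example, classify(15, "0:10", "0:20") gives "warning", because 15 has left the warning range but is still inside the critical one.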