Alerting on abnormal web page response times

It is important that our web application remains responsive for users. Sites that frequently lag put users off and can result in them going elsewhere or in lost sales. In Chapter 2, Diving into Data – Search and Report, you completed a recipe that analyzes average response times over a given period. In this recipe, you will create a scheduled alert to identify response times that are abnormal (that is, not within the normal range).

Getting ready

To step through this recipe, you will need a running Splunk Enterprise server, with the sample data loaded from Chapter 1, PlayTime – Getting Data In. You should be familiar with navigating the Splunk user interface. You should also have configured the e-mail settings on your Splunk server to enable the delivery of e-mail alerts.

Note

If you are running your Splunk server on a desktop PC for the purposes of this book and are wondering how to configure the e-mail settings, there is a good Splunk blog post on configuring Splunk to work with Gmail and Yahoo Mail at http://blogs.splunk.com/tag/gmail/.
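If you would rather edit configuration files than use the Settings menu, the mail server details Splunk uses for alert e-mails live in the [email] stanza of alert_actions.conf. The following is only a rough sketch for a Gmail account; the host, port, account name, and password are placeholders, and the exact settings required may differ for your mail provider and Splunk version:

  [email]
  mailserver = smtp.gmail.com:587
  use_tls = 1
  auth_username = your.account@gmail.com
  auth_password = your-app-password
  from = your.account@gmail.com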

How to do it…

Follow the steps in this recipe to create an alert that identifies abnormal response times:

  1. Log in to your Splunk server.
  2. Select the Operational Intelligence application.
  3. Ensure that the time range picker is set to Last 60 minutes and type the following search into the Splunk search bar. Then, click on the magnifying glass icon or hit Enter.
    sourcetype=access_combined index=main | stats max(response) AS MAX by uri_path | join uri_path [search earliest=-25h latest=-24h sourcetype=access_combined index=main | stats avg(response) AS AVG by uri_path] | eval MAG=round(MAX/AVG)

    Tip

    This search relies on data having been indexed on your Splunk server 25 hours ago. If this data is not present, you might need to adjust the earliest and latest times of the subsearch accordingly.

  4. Once the search completes, you should see tabulated data. In the Save As dropdown, select Alert.
  5. A pop-up box will appear to provide the alert configuration options. Perform the following steps on this screen:
    1. Give the alert the Title cp08_abnormal_webpage_response.
    2. Select Shared in App as Permissions.
    3. Select Scheduled as Alert type, and select Run every hour from the dropdown below.
    4. Select Custom as the Trigger alert when value, and enter search MAG>5 in the textbox below that. Select Once as the Trigger.
    5. Click on Add Actions. Select Add to Triggered Alerts from the Add Actions dropdown.
    6. From the Severity dropdown, select Medium.
    7. Click on Add Actions. Select Send email from the Add Actions dropdown.
    8. Enter a valid e-mail address in the To box; this is where the alert e-mail will be sent.
    9. In the Include section, check Link to Alert, Link to Results, Trigger Condition, and Trigger Time.
    10. In the Type section, select HTML & Plain Text.
    11. Verify that the given details are entered correctly, and then click on Save.
  6. Click on View Alert. A summary screen should be displayed. Your first alert is now configured and set to run every hour in accordance with the schedule. The alert should trigger but might take some time, depending on when you scheduled the alert.
  7. When the alert triggers, you should receive an e-mail. Note that the link to the alert, the link to the results, and the trigger condition appear in the body of the e-mail, as specified when configuring the alert.

How it works…

In this recipe, you created a search to look for abnormal web page response times and saved it as an alert that triggers when the maximum response time in the last hour for a given web page is more than five times the average response time for that page during the same hour the previous day.

We chose to take the average from the same period yesterday, as today's data might itself be abnormal. You might wish to look back over a wider period, such as 7 days, to get a more accurate average. Alternatively, you could use a hardcoded threshold for the number of milliseconds within which a web page must respond.
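If you do want a broader baseline, the following is one possible variant, shown only as a sketch; it assumes the same index, source type, and field names used in this recipe and simply widens the subsearch window to the previous 7 days (excluding the current hour):

  sourcetype=access_combined index=main | stats max(response) AS MAX by uri_path | join uri_path [search earliest=-7d@h latest=-1h@h sourcetype=access_combined index=main | stats avg(response) AS AVG by uri_path] | eval MAG=round(MAX/AVG)

For a hardcoded threshold, you could instead drop the join and eval, append something like | where MAX>2000 (the 2,000 millisecond value is purely illustrative and would need tuning for your pages), and trigger the alert whenever the number of results is greater than zero.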

Let's break the recipe's search down piece by piece:

sourcetype=access_combined index=main

You should now be familiar with this search from the earlier recipes in this book.

| stats max(response) AS MAX by uri_path

Using the stats command with the max function, we first identify the maximum response time per web page (uri_path) over the past hour. We name this field MAX.

| join uri_path [search earliest=-25h latest=-24h sourcetype=access_combined index=main | stats avg(response) AS AVG by uri_path]

Using the join command, we join the results of a subsearch with our results. This subsearch covers the same one-hour window, but 24 hours earlier. The average response time is calculated for each page and given a field name of AVG.

| eval MAG=round(MAX/AVG)

Using the MAX and AVG fields, we calculate how many times greater the MAX value is than the AVG value, giving us a MAG field. For example, if MAX is 1,500 ms and AVG is 250 ms, then MAG is 6, which exceeds the threshold of 5 used by the alert.

After writing the search, you saved it as an alert and gave it the custom condition of search MAG>5. This essentially means that if a response time is seen in the past hour that is greater than five times the average for that particular web page, the alert will be triggered. Behind the scenes, Splunk essentially adds this condition to the search string, and if any results are returned, the alert is triggered.
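You can see this for yourself by appending the condition to the recipe's search and running it manually over the last 60 minutes; any rows returned are the pages that would have triggered the alert:

  sourcetype=access_combined index=main | stats max(response) AS MAX by uri_path | join uri_path [search earliest=-25h latest=-24h sourcetype=access_combined index=main | stats avg(response) AS AVG by uri_path] | eval MAG=round(MAX/AVG) | search MAG>5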

There's more…

This alert uses the scheduled alert type and is based on a historical search that runs periodically in accordance with the hourly schedule you set. In terms of alert actions, you chose to have an e-mail sent out each time the alert triggers and to have the triggered alert listed in the Alert manager inside Splunk.
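Behind the scenes, the alert you built through the UI is stored as a saved search in the Operational Intelligence application's savedsearches.conf file. The following stanza is only a simplified sketch of what gets written there; the attribute values shown (the cron schedule, dispatch times, and e-mail address) are illustrative and will vary with your configuration and Splunk version:

  [cp08_abnormal_webpage_response]
  search = sourcetype=access_combined index=main | stats max(response) AS MAX by uri_path | join uri_path [search earliest=-25h latest=-24h sourcetype=access_combined index=main | stats avg(response) AS AVG by uri_path] | eval MAG=round(MAX/AVG)
  # dispatch window matching the Last 60 minutes time range
  dispatch.earliest_time = -60m@m
  dispatch.latest_time = now
  # scheduled to run at the top of every hour
  enableSched = 1
  cron_schedule = 0 * * * *
  # custom trigger condition evaluated against the results
  counttype = custom
  alert_condition = search MAG>5
  # alert actions: list in Triggered Alerts and send an e-mail
  alert.track = 1
  action.email = 1
  action.email.to = you@example.com

Settings such as the alert severity and the e-mail include options are written to the same stanza, and any changes you make to the alert in the UI are saved back to it.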

Viewing triggered alerts in Splunk's Alert manager

Assuming that an alert has been triggered, you can view the alert in the Alert manager by clicking on the Activity drop-down menu at the top right-hand side of the screen and selecting Triggered Alerts.


Once the screen has loaded, you will see all the alerts that have triggered to date. There are not many management actions you can take on triggered alerts, although you can delete them from the list or edit the underlying alert if any tweaks are needed. Clicking on View Results for any alert opens a new screen with the results of the underlying search that powers the alert. Basic filtering and search capabilities are also provided for the triggered alerts in the list.


See also

You can refer to the following recipes for more information:

  • The Alerting on errors during checkout in real time recipe
  • The Alerting on abnormal user behavior recipe
  • The Alerting on failure and triggering a scripted response recipe