Identifying the top-referring websites

Our web access logs continue to give us great information about our website and the users visiting it. Understanding where our users come from provides insight into potential sales leads and into which marketing activities are working better than others. For this information, we look at the referer_domain field in the log data.

In this recipe, we will write a Splunk search to find the top-referring websites.

Getting ready

To step through this recipe, you will need a running Splunk Enterprise server, with the sample data loaded from Chapter 1, Play Time – Getting Data In. You should be familiar with the Splunk search bar and the time range picker.

How to do it…

Follow the given steps to search for the top-referring websites:

  1. Log in to your Splunk server.
  2. Select the Search & Reporting application.
  3. Ensure that the time range picker is set to Last 24 hours and type the following search into the Splunk search bar. Then, click on Search or hit Enter.
    index=main sourcetype=access_combined | stats dc(clientip) AS Referals by referer_domain | sort - Referals
  4. Splunk will return a tabulated list ordered by the number of unique referrals each website has provided.
  5. Save this search by clicking on Save As and then on Report. Give the report the name cp02_top_referring_websites and click on Save. On the next screen, click on Continue Editing to return to the search.

How it works…

Let's break down the search piece by piece:

Search fragment
    Description

index=main sourcetype=access_combined
    You should now be familiar with this search from the earlier recipes in this chapter.

| stats dc(clientip) AS Referals by referer_domain
    Using the stats command, we apply the distinct count (dc) function to clientip to count the unique IP addresses for each referer_domain, and name the resulting field Referals.

| sort - Referals
    Using the sort command, we sort by the number of referrals in descending order.
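
For readers who find it easier to follow the logic outside Splunk, the search can be approximated in a few lines of Python. This is only a sketch of the idea, not how Splunk implements stats; the events list and the unique_referrals function are hypothetical names invented for illustration, and the domains are placeholders.

```python
from collections import defaultdict

# Hypothetical (clientip, referer_domain) pairs standing in for log events.
events = [
    ("10.0.0.1", "www.example.com"),
    ("10.0.0.1", "www.example.com"),  # same visitor again, counted once
    ("10.0.0.2", "www.example.com"),
    ("10.0.0.3", "www.example.org"),
]


def unique_referrals(events):
    """Approximate: stats dc(clientip) AS Referals by referer_domain | sort - Referals."""
    ips_by_domain = defaultdict(set)
    for clientip, referer_domain in events:
        # A set keeps each IP once per domain, which is what dc() measures.
        ips_by_domain[referer_domain].add(clientip)
    rows = [(domain, len(ips)) for domain, ips in ips_by_domain.items()]
    # Descending sort on the distinct count, like "sort - Referals".
    return sorted(rows, key=lambda row: row[1], reverse=True)


print(unique_referrals(events))
```

Note that the repeated visit from 10.0.0.1 does not inflate the figure for its domain, because only distinct IP addresses are counted.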

There's more…

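In this recipe, we did not use the top command, because top only reports how often each value of a field occurs (as a count and a percentage) and cannot calculate a distinct count of another field such as clientip. The stats command is far more powerful and has many available functions, including distinct count.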
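
The difference between counting events and counting distinct visitors can change the ranking. The following Python sketch (not Splunk code) illustrates this with made-up data; sample, by_event_count, and by_unique_visitors are hypothetical names, and the first function only approximates the kind of per-value event count that top reports.

```python
from collections import Counter, defaultdict

# Hypothetical events: one busy visitor from a.example, two visitors from b.example.
sample = [
    ("10.0.0.1", "a.example"),
    ("10.0.0.1", "a.example"),
    ("10.0.0.1", "a.example"),
    ("10.0.0.2", "b.example"),
    ("10.0.0.3", "b.example"),
]


def by_event_count(events):
    """Rank domains by number of events, roughly what a per-value count gives."""
    return Counter(domain for _, domain in events).most_common()


def by_unique_visitors(events):
    """Rank domains by number of distinct client IPs, like dc(clientip)."""
    ips = defaultdict(set)
    for clientip, domain in events:
        ips[domain].add(clientip)
    return sorted(((d, len(s)) for d, s in ips.items()),
                  key=lambda row: row[1], reverse=True)


print(by_event_count(sample))      # a.example ranks first on raw events
print(by_unique_visitors(sample))  # b.example ranks first on distinct visitors
```

In this sample, a single repeat visitor puts a.example ahead on raw event counts, while b.example leads once each visitor is counted only once.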

Searching for the top 10 using stats instead of top

Using the stats command in this recipe, we brought back all the websites present in our web access logs and then sorted them by the number of unique referrals. Should we want to only show the top 10, we can simply add the head command at the end of our search, as follows:

index=main sourcetype=access_combined | stats dc(clientip) AS Referals by referer_domain | sort - Referals | head 10

The head command keeps the first specified number of rows. In this case, as we have a descending sort, by keeping the first 10 rows, we are essentially keeping the top 10. Instead of using the head command, we can also use the limit parameter of the sort command, as follows:

index=main sourcetype=access_combined | stats dc(clientip) AS Referals by referer_domain | sort limit=10 - Referals

Note

There is a great guide in the Splunk documentation to understand all the different functions for stats, chart, and timechart, available at http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/CommonStatsFunctions.

See also

Also refer to the following recipes for more information:

  • The Finding the most used web browsers recipe
  • The Charting web page response codes recipe
  • The Displaying web page response time statistics recipe