Our web access logs continue to give us great information about our website and the users visiting it. Understanding where our users come from provides insight into potential sales leads and into which marketing activities are working better than others. For this information, we look at the referer_domain field value within the log data.
In this recipe, we will write a Splunk search to find the top-referring websites.
To step through this recipe, you will need a running Splunk Enterprise server, with the sample data loaded from Chapter 1, Play Time – Getting Data In. You should be familiar with the Splunk search bar and the time range picker.
Follow the given steps to search for the top-referring websites:
index=main sourcetype=access_combined | stats dc(clientip) AS Referals by referer_domain | sort - Referals
Save the search with the name cp02_top_referring_websites and click on Save. On the next screen, click on Continue Editing to return to the search.
Let's break down the search piece by piece:
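Search fragment | Description |
---|---|
index=main sourcetype=access_combined | You should now be familiar with this search from the earlier recipes in this chapter. |
\| stats dc(clientip) AS Referals by referer_domain | Using the stats command, we apply the distinct count function, dc, to clientip to count the unique client IP addresses for each referer_domain, and name the resulting field Referals. |
\| sort - Referals | Using the sort command, we sort the results by the Referals field in descending order. |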
In this recipe, we did not use the top command, as this command only provides limited functionality. The stats command is far more powerful and has many available functions, including distinct count.
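To make the difference between an event count and a distinct count concrete, here is a minimal Python sketch that emulates both outside Splunk. The sample events, the example domains, and the helper names (event_count_by_referer, distinct_clients_by_referer, sort_descending) are hypothetical and exist only for illustration; this is not how Splunk implements top or stats.

```python
from collections import Counter, defaultdict

# Hypothetical sample events; only the two fields used by the search are kept.
events = [
    {"clientip": "10.0.0.1", "referer_domain": "search.example.com"},
    {"clientip": "10.0.0.1", "referer_domain": "search.example.com"},
    {"clientip": "10.0.0.2", "referer_domain": "search.example.com"},
    {"clientip": "10.0.0.3", "referer_domain": "news.example.org"},
    {"clientip": "10.0.0.3", "referer_domain": "news.example.org"},
    {"clientip": "10.0.0.3", "referer_domain": "news.example.org"},
    {"clientip": "10.0.0.3", "referer_domain": "news.example.org"},
]


def event_count_by_referer(events):
    """Count events per referer_domain (the kind of count top ranks by)."""
    return dict(Counter(e["referer_domain"] for e in events))


def distinct_clients_by_referer(events):
    """Count unique clientip values per referer_domain,
    analogous to: stats dc(clientip) AS Referals by referer_domain."""
    seen = defaultdict(set)
    for e in events:
        seen[e["referer_domain"]].add(e["clientip"])
    return {domain: len(ips) for domain, ips in seen.items()}


def sort_descending(counts):
    """Order (domain, count) pairs from highest to lowest,
    analogous to: sort - Referals."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


by_events = sort_descending(event_count_by_referer(events))
by_clients = sort_descending(distinct_clients_by_referer(events))
print(by_events)
print(by_clients)
```

In this made-up data, one busy client pushes news.example.org to the top of the event count, while the distinct count ranks search.example.com first because more unique clients arrived from it.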
Using the stats command in this recipe, we brought back all the websites present in our web access logs and then sorted them by the number of unique referrals. To show only the top 10, we can simply add the head command at the end of our search, as follows:
index=main sourcetype=access_combined | stats dc(clientip) AS Referals by referer_domain | sort - Referals | head 10
The head command keeps the first specified number of rows. In this case, as we have a descending sort, by keeping the first 10 rows, we are essentially keeping the top 10. Instead of using the head command, we can also use the limit parameter of the sort command, as follows:
index=main sourcetype=access_combined | stats dc(clientip) AS Referals by referer_domain | sort - Referals limit=10
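The two approaches above should return the same rows. As a rough illustration of why, the following Python sketch compares sorting everything and then keeping the first n rows with asking the sort step for only n rows. The rows, the example domains, and the function names are hypothetical, and the sketch says nothing about how Splunk executes either search internally.

```python
import heapq

# Hypothetical (referer_domain, Referals) rows, as the stats step might return them.
rows = [
    ("a.example.com", 12),
    ("b.example.com", 40),
    ("c.example.com", 7),
    ("d.example.com", 25),
]


def top_n_sort_then_head(rows, n):
    """Sort all rows in descending order, then keep the first n
    (analogous to: sort - Referals | head n)."""
    return sorted(rows, key=lambda r: r[1], reverse=True)[:n]


def top_n_limited_sort(rows, n):
    """Ask the sort step itself for only the n largest rows
    (analogous to using the limit parameter of sort)."""
    return heapq.nlargest(n, rows, key=lambda r: r[1])


print(top_n_sort_then_head(rows, 2))
print(top_n_limited_sort(rows, 2))
```

Both functions return the same two rows here; the difference lies only in where the truncation happens.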
There is a great guide in the Splunk documentation to understand all the different functions for stats, chart, and timechart, available at http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/CommonStatsFunctions.