Creating a Python application to return unique IP addresses

The Splunk Python SDK was one of the first SDKs that Splunk developed, and it has since been used to bring Splunk's ability to process and analyze large streams of data into custom applications. By integrating Splunk directly with your applications, you can see immediate results and make full use of your operational intelligence capabilities.

In this recipe, you will learn how to use Splunk's Python SDK to create a custom Python application that will return unique IP addresses from the web server logs of our application.

Getting ready

To step through this recipe, you will need a running Splunk Enterprise server, with the sample data loaded from Chapter 1, Play Time – Getting Data In. You should be familiar with navigating the Splunk user interface and using the Splunk search language. Some basic knowledge of Python is recommended. The Splunk Python SDK should also be downloaded and available on your Splunk Enterprise server. This recipe expects Python 2.7 to be installed on your Splunk server. The example as written uses Python 2 syntax (such as the print statement) and will not run under Python 3.

Note

The Splunk Python SDK can be downloaded from http://dev.splunk.com.

How to do it...

Follow the steps in this recipe to create a Python application that returns unique IP addresses:

  1. Open a terminal window on your Splunk server.
  2. Execute the following command to export the Python SDK directory location as an environment variable. Update the value of PYTHONPATH with the actual path where you have installed the SDK:
    export PYTHONPATH=~/splunk-sdk-python
  3. Create a new file called uniqueip.py and open it for editing.
  4. To the uniqueip.py file, add the import statements that are needed to load the correct Splunk libraries that we will be using:
    import splunklib.client as client
    import splunklib.results as results
  5. Add constants to hold the values of the Splunk server we are connecting to and the credentials we are connecting with. You will likely need to change the Splunk username and password credentials from the default ones:
    HOST     = "localhost"
    PORT     = 8089
    USERNAME = "admin"
    PASSWORD = "changeme"
  6. Define the service instance we will be using to connect and communicate with our Splunk Enterprise server:
    service = client.connect(
        host=HOST,
        port=PORT,
        username=USERNAME,
        password=PASSWORD)
  7. Define a dictionary of search arguments that control how the search behaves (its time range, search mode, and execution mode):
    kwargs = {"earliest_time": "-15m",
              "latest_time": "now",
              "search_mode": "normal",
              "exec_mode": "blocking"}
  8. Add a variable to hold the search query we will be using to return our list of unique IP addresses. Any double quotes in the search query need to be escaped:
    searchquery = "search index=main sourcetype=\"access_combined\" | stats count by clientip"
  9. Create the job request and print out to the console when it has been completed:
    job = service.jobs.create(searchquery, **kwargs)
    print "Job completed...printing results!\n"
  10. Create a reference to the search results, as follows:
    search_results = job.results()
  11. Add a ResultsReader, iterate through the results, and print out the IP address and the associated count:
    reader = results.ResultsReader(search_results)
    for result in reader:
        print "Result: %s => %s" % (result['clientip'],result['count'])
  12. The complete program code should look like this:
    import splunklib.client as client
    import splunklib.results as results
    
    HOST     = "localhost"
    PORT     = 8089
    USERNAME = "admin"
    PASSWORD = "changeme"
    
    service = client.connect(
        host=HOST,
        port=PORT,
        username=USERNAME,
        password=PASSWORD)
    
    kwargs = {"earliest_time": "-15m",
              "latest_time": "now",
              "search_mode": "normal",
              "exec_mode": "blocking"}
    
    searchquery = "search index=main sourcetype=\"access_combined\" | stats count by clientip"
    
    job = service.jobs.create(searchquery, **kwargs)
    print "Job completed...printing results!\n"
    
    search_results = job.results()
    
    reader = results.ResultsReader(search_results)
    for result in reader:
        print "Result: %s => %s" % (result['clientip'],result['count'])
  13. To execute your program, run the following line:
    python uniqueip.py

The output of the program should look like this:

Result: 106.207.151.69 => 1
Result: 107.220.112.174 => 12
Result: 12.181.33.129 => 12
Result: 120.76.179.40 => 1
Result: 128.180.195.184 => 10

The program's output details the number of events in the web access logs by client IP over the last 15-minute time frame specified in the Python code.
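
To make the aggregation concrete, the following is a minimal sketch of what the stats count by clientip portion of the search computes, written in plain Python rather than through the SDK. The sample events and the count_by_clientip helper are hypothetical and exist only for illustration; this is not how Splunk implements the command. It uses the print function so that it runs under Python 3 as well as Python 2.7.

```python
from __future__ import print_function  # keeps print() working on Python 2.7

from collections import Counter

# Hypothetical events: each dict stands in for one web access log event.
SAMPLE_EVENTS = [
    {"clientip": "10.0.0.1", "status": "200"},
    {"clientip": "10.0.0.2", "status": "404"},
    {"clientip": "10.0.0.1", "status": "200"},
    {"clientip": "10.0.0.3", "status": "200"},
    {"clientip": "10.0.0.1", "status": "500"},
]


def count_by_clientip(events):
    """Return (clientip, count) pairs, sorted by clientip as a string."""
    counts = Counter(
        event["clientip"] for event in events if "clientip" in event
    )
    return sorted(counts.items())


for clientip, count in count_by_clientip(SAMPLE_EVENTS):
    print("Result: %s => %s" % (clientip, count))
```

Each output row pairs one distinct client IP with the number of events that carried it, which is the same shape as the rows printed by uniqueip.py. Events without a clientip field are skipped in this sketch.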

How it works...

At the core of working with Splunk is the REST API. The REST API is used by Splunk to do everything from authenticating to searching to configuration management. As we have seen in another recipe of this chapter, we can interact with the REST API very easily with simple command-line tools.

Organizations that maintain their own line-of-business applications and want to integrate the operational intelligence they get out of Splunk can do so by using the SDK for the language their application is written in. Splunk has created SDKs for many of the mainstream programming languages. Python was the first one developed and released, since a large amount of Splunk itself is developed using Python.

The SDK is a wrapper around calls to the REST API and abstracts the details by providing easy-to-use objects to interact with. Most of the REST endpoints available natively are exposed as objects in the SDK.
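
As a rough illustration of the wrapping described above, the sketch below builds the kind of HTTP request that creating a search job corresponds to, without sending it. It assumes the search job endpoint path services/search/jobs on the management port and a form-encoded body. The build_search_job_request helper is hypothetical, authentication is omitted, and the SDK's actual internals may differ.

```python
from __future__ import print_function

try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2.7


def build_search_job_request(host, port, searchquery, **kwargs):
    """Return the (url, body) pair for a POST that would create a search job.

    Assumption: jobs are created by POSTing form-encoded parameters to
    /services/search/jobs. Nothing is sent over the network here.
    """
    url = "https://%s:%d/services/search/jobs" % (host, port)
    params = {"search": searchquery}
    params.update(kwargs)
    # Sort the parameters so the encoded body is deterministic.
    body = urlencode(sorted(params.items()))
    return url, body


url, body = build_search_job_request(
    "localhost", 8089,
    'search index=main sourcetype="access_combined" | stats count by clientip',
    earliest_time="-15m", latest_time="now", exec_mode="blocking")
print(url)
print(body)
```

The keyword arguments mirror the kwargs dictionary from the recipe. With the SDK, the connection handling, authentication, and encoding shown here are taken care of by the service object.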

As seen in the recipe, most of the functionality used helps with creating a connection and managing authentication, creating a search job, and processing the results. There are also objects that can be created to manage users and roles, get data into Splunk, and work with saved searches.
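
A minimal sketch of working with those other objects follows. It assumes that the connected service object exposes users, roles, and saved_searches collections whose items have a name attribute, as described in the splunklib.client documentation. So that the example runs without a Splunk server, it uses hypothetical stand-in objects (Entity and FakeService) in place of a real connection.

```python
from __future__ import print_function

from collections import namedtuple

# Hypothetical stand-ins so the sketch runs without a Splunk server.
Entity = namedtuple("Entity", ["name"])
FakeService = namedtuple("FakeService", ["users", "roles", "saved_searches"])


def summarize_service(service):
    """Collect the names of users, roles, and saved searches from a service."""
    return {
        "users": sorted(user.name for user in service.users),
        "roles": sorted(role.name for role in service.roles),
        "saved_searches": sorted(s.name for s in service.saved_searches),
    }


fake = FakeService(
    users=[Entity("admin")],
    roles=[Entity("user"), Entity("admin")],
    saved_searches=[Entity("Unique IPs")],
)
print(summarize_service(fake))
```

Against a real server, the service object returned by client.connect in the recipe would be passed to summarize_service in place of the stand-in.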

There's more...

In this recipe, we began to scratch the surface of utilizing the Python SDK. Also, you saw how you can extend your own applications to leverage Splunk data. As with most of Splunk, there are many different ways to manipulate and view your data.

Paginating the results of your search

Leveraging the program created in this recipe, you can modify it as follows to paginate your results:

import splunklib.client as client
import splunklib.results as results

…

job = service.jobs.create(searchquery, **kwargs)
print "Job completed...printing results!\n"

total  = int(job["resultCount"])
offset = 0
count  = 10

while offset < total:
    page_args = {"count": count,
                 "offset": offset}

    search_results = job.results(**page_args)
    reader = results.ResultsReader(search_results)
    for result in reader:
        print "Result: %s => %s" % (result['clientip'],result['count'])
    offset += count
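
The offset arithmetic in the loop above can be checked on its own, without a Splunk server. The following is a small sketch that generates the same count and offset pairs for a given result total; the page_args generator is a hypothetical helper, not part of the SDK.

```python
from __future__ import print_function


def page_args(total, count=10):
    """Yield one {"count": ..., "offset": ...} dict per page of results."""
    if count <= 0:
        raise ValueError("count must be a positive integer")
    offset = 0
    while offset < total:
        yield {"count": count, "offset": offset}
        offset += count


pages = list(page_args(25, count=10))
print(pages)
```

For 25 results and a page size of 10, this yields three pages starting at offsets 0, 10, and 20; the last page simply returns fewer rows. In the program above, each dictionary would play the role of page_args in the call to job.results.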

See also

  • The Remotely querying Splunk's REST API for unique page views recipe
  • The Adding a calendar heatmap of product purchases recipe
  • The Creating a custom search command to format product names recipe