Splunk with R for analytics

We now have enough knowledge of Splunk's features and analytical capabilities; let's look at R and its capabilities. R is a programming language and environment for statistical computing and graphics, widely used for data analysis and data mining. It has extensive library support for linear and nonlinear modeling, clustering, classification, time series analysis, graphical plotting, prediction, forecasting, and so on.

Splunk, being a big data tool, can be integrated with R to leverage R's advanced analytical capabilities for real-time insights. The R Project app was previously listed on the Splunk app store but is no longer available there; it can now be downloaded from GitHub (https://github.com/rfsp/r/).

The app can be installed on a Splunk instance like any other app downloaded from the Splunk app store. It exposes a new search command, r, which allows us to pass data from Splunk to the R-Engine for calculation and then pass the results back to Splunk for further computation or visualization.

The R Project app makes it possible to run custom R scripts right from the Splunk search console. With this integration, R's real-time data analysis, data mining, and other statistical algorithms and packages can be used directly from Splunk. The following diagram shows how the R-Engine interacts with Splunk and how different stakeholders can use it:

[Diagram: Splunk with R for analytics]

When the r command is executed in a Splunk search, the data from the Splunk pipeline is saved as a .csv file. R reads this CSV file as input and runs the script, which writes its result to another .csv file. The resulting CSV file is then loaded back into Splunk, where it can be used to create visualizations or generate insights from the processed data.
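The CSV round trip described above can be imitated in plain R. The following is only a sketch of the data flow, not the app's actual implementation: the sample events, the temporary file paths, and the aggregation are all assumptions made for illustration. Only the variable names input and output follow the convention used by the r command.

```r
# Hypothetical stand-in for events coming out of the Splunk pipeline.
events <- data.frame(host  = c("web1", "web2", "web1"),
                     bytes = c(120, 340, 560))

# Step 1: the pipeline data is saved as a CSV file (illustrative temp path).
input_csv <- tempfile(fileext = ".csv")
write.csv(events, input_csv, row.names = FALSE)

# Step 2: R reads that CSV into a variable named `input`.
input <- read.csv(input_csv, stringsAsFactors = FALSE)

# Step 3: the script computes something and assigns it to `output`.
output <- aggregate(bytes ~ host, data = input, FUN = sum)

# Step 4: the result is written to a second CSV file ...
output_csv <- tempfile(fileext = ".csv")
write.csv(output, output_csv, row.names = FALSE)

# ... which is what would be loaded back into Splunk.
result <- read.csv(output_csv, stringsAsFactors = FALSE)
print(result)
```

Because everything crosses the boundary as CSV, values arrive in R as plain columns of text or numbers, so a script should not assume richer types than a CSV file can carry.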

The setup

The following are the steps to be followed to integrate the R app with Splunk:

  1. Download and install the R app for Splunk from the GitHub link provided in the preceding section.
  2. The R tool also needs to be installed, and it can be downloaded from https://cran.r-project.org/bin/windows/base. No version-specific compatibility issues are expected; the examples and illustrations in this chapter were completed using R 3.1.0.
  3. Once the R tool is installed, its installation directory path is required to be configured in the Splunk R app. Generally, the default path is the Program Files\R folder in the drive where Windows OS is installed. The path in our example is C:\Program Files\R\R-3.1.0\bin.

    Note that the setup procedure is described for a Windows system; on Linux or Mac OS, configure the corresponding folder/path instead.

  4. Now, in Splunk Web console, navigate to Apps | R Project | Setup. A page similar to the following screenshot will be visible. Key in the path of the R tool installation and click on Save. This is the one-time configuration required to set the path of the R tool in the Splunk app for R.
    [Screenshot: The setup]
  5. Now, once the path is configured, sections such as Examples, Scripts, and Packages of the app will be accessible to us.
  6. Packages can be installed by navigating to the Packages menu. There are two ways of installing packages from the R Project app on Splunk. The packages can either be manually downloaded from the CRAN repository or can be specified in the textbox. Depending on the option selected, the specified packages will get installed and be available for use in R scripts.
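To find the path to key into the setup page, and to confirm that a package is visible to R, a short check can be run in an R console. This is a minimal sketch using base R functions only; the directory it reports may include an architecture subfolder on Windows, so treat it as a starting point rather than the exact value the app expects.

```r
# Directory that holds the R executables for the running installation.
r_bin <- R.home("bin")
print(normalizePath(r_bin))

# Version of R in use (the chapter's examples were written against 3.1.0).
print(R.version.string)

# Check that a package can be loaded; "stats" ships with every R installation.
has_stats <- requireNamespace("stats", quietly = TRUE)
print(has_stats)
```

Replacing "stats" with the name of a package installed from the Packages menu gives a quick confirmation that R scripts will be able to load it.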

Using R with Splunk

Once we have set up the R app, we can use the r command for computation or to run custom R scripts from the Splunk search itself. The Splunk app for R comes with various examples that illustrate how to use the R-Engine from Splunk via the R Project app. The examples can be accessed from the Splunk Web console by navigating to R Project App | Examples. The following are the steps for running custom R scripts with Splunk:

  1. Run a search command that takes input from Splunk:
    index=_internal | r "output=colnames(input)"

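    The preceding search command passes the result of index=_internal to the R-Engine through the input variable. Whatever the script assigns to the output variable, here the column names of the input, is returned to Splunk.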

  2. Run the R function on Splunk:
    | r "
         gm_mean = function(x, na.rm=TRUE){
           exp(sum(log(x[x > 0]), na.rm=na.rm) / length(x))
         }
         data <- data.matrix(input);
         output <- apply(data, 2, gm_mean)
        "
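The two steps above can be tried outside Splunk before running them in a search. The sketch below reuses the gm_mean function exactly as written in step 2; the sample data frame is a hypothetical stand-in for the search results that Splunk would pass in as input.

```r
# Geometric mean, as defined in the search above.
gm_mean <- function(x, na.rm = TRUE) {
  exp(sum(log(x[x > 0]), na.rm = na.rm) / length(x))
}

# Hypothetical sample standing in for the Splunk results passed as `input`.
input <- data.frame(a = c(1, 2, 4), b = c(1, 10, 100))

# Step 1 equivalent: the field names of the incoming data.
print(colnames(input))

# Step 2 equivalent: convert to a numeric matrix and apply gm_mean per column.
data <- data.matrix(input)
output <- apply(data, 2, gm_mean)
print(output)
```

Note that the divisor is length(x), so zero, negative, or missing values are left out of the sum but still counted in the denominator. Also, data.matrix converts non-numeric columns to integer codes, so it is usually safer to restrict the search to numeric fields before piping it to r.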

Besides inline scripts, the R Project app provides an interface to upload custom scripts that can be used along with the r command. A script can be uploaded from the Splunk Web console by navigating to R Project App | Scripts.

An uploaded script can be used with the r command so that values from the Splunk pipeline are passed to the script as input, and the result from the R-Engine is returned to Splunk to create visualizations and generate insights.
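As an illustration, a script intended for upload might look like the following sketch. The file name and the sample data are hypothetical, and it is an assumption that an uploaded script sees the same input and output variables as an inline script; check the examples bundled with the app for how uploaded scripts are actually invoked.

```r
# summarize_numeric.R (hypothetical file name)
# Assumption: the data arrives in `input` and the result is returned through
# `output`, as with inline scripts. A sample `input` is created only when the
# script is run outside Splunk.
if (!exists("input")) {
  input <- data.frame(host    = c("web1", "web2", "web3"),
                      bytes   = c(100, 200, 600),
                      latency = c(1.5, 2.5, 3.5))
}

# Keep only the numeric fields; text fields such as host are skipped.
numeric_cols <- input[, sapply(input, is.numeric), drop = FALSE]

# One row per numeric field with its mean and standard deviation.
output <- data.frame(field = names(numeric_cols),
                     mean  = sapply(numeric_cols, mean),
                     sd    = sapply(numeric_cols, sd),
                     row.names = NULL)
print(output)
```

Returning a data frame with one row per field keeps the result tabular, which suits a round trip through CSV back into Splunk.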

Although the Splunk Machine Learning engine keeps evolving and can now run complex algorithms, the R Project app can be used to integrate any pre-existing R scripts or algorithms with Splunk right away. This helps in achieving enterprise integration of pre-existing tools, scripts, or technology with Splunk.
