While most of the data you will index with Splunk will be collected in real time, there might be instances where you have a set of data that you would like to put into Splunk, either to backfill some missing or incomplete data, or just to take advantage of its searching and reporting tools.
This recipe will show you how to perform one-time bulk loads of data from files located on the Splunk server. We will also use this recipe to load the data samples that will be used throughout the subsequent chapters as we build our operational intelligence app in Splunk.
There are two files that make up our sample data. The first is access_log, which represents the data from our web layer and is modeled on an Apache web server. The second file is app_log, which represents the data from our application layer and is modeled on log4j log data from our custom middleware application.
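To give a feel for what a web access log modeled on Apache looks like once it is broken into fields, here is a minimal Python sketch that parses one line in the standard Apache combined format. The regular expression, the function name `parse_access_line`, and the example line are assumptions written for this illustration; the lines in the book's sample data may carry additional fields, which this pattern simply ignores.

```python
import re

# Apache "combined" log format: host, identity, user, timestamp, request line,
# status, response size, referer, user agent.
COMBINED_RE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_access_line(line):
    """Return a dict of fields for one combined-format line, or None."""
    match = COMBINED_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    return fields


# Hypothetical line written for this sketch; not taken from the sample data.
EXAMPLE_LINE = (
    '192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] '
    '"GET /index.html HTTP/1.1" 200 2326 '
    '"http://example.com/start" "Mozilla/5.0"'
)

if __name__ == "__main__":
    print(parse_access_line(EXAMPLE_LINE))
```

Splunk performs this kind of field extraction itself for the access_combined source type, so the sketch is only meant to show the structure of the data, not something you need to run as part of the recipe.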
To step through this recipe, you will need a running Splunk server and you should have a copy of the sample data generation app (OpsDataGen.spl) for this book.
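Follow the given steps to load the sample data generator on your system:
Select the OpsDataGen.spl file on your computer, and then click on the Upload button to install the application.
Once the installation completes, you should see the OpsDataGen app in the list of apps.
Enable the script input that matches the operating system Splunk is installed on:
For Linux: $SPLUNK_HOME/etc/apps/OpsDataGen/bin/AppGen.path
For Windows: $SPLUNK_HOME\etc\apps\OpsDataGen\bin\AppGen-win.path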
The following screenshot displays both the Windows and Linux inputs that are available after installing the OpsDataGen app. It also displays where to click to enable the correct one based on the operating system Splunk is installed on.
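Enable the two file inputs that match the operating system Splunk is installed on:
For Linux: $SPLUNK_HOME/etc/apps/OpsDataGen/data/access_log and $SPLUNK_HOME/etc/apps/OpsDataGen/data/app_log
For Windows: $SPLUNK_HOME\etc\apps\OpsDataGen\data\access_log and $SPLUNK_HOME\etc\apps\OpsDataGen\data\app_log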
Once the inputs are enabled, you can check for the generated data with the following search:
index=main sourcetype=log4j OR sourcetype=access_combined
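The search above mixes an implicit AND with an explicit OR, so it helps to see which events it selects. In Splunk's search command, OR clauses are evaluated before AND clauses, which means the query reads as index=main AND (sourcetype=log4j OR sourcetype=access_combined). The following Python sketch emulates that filter on a handful of hypothetical events; the function name `matches_search` and the event dictionaries are assumptions made up for this illustration.

```python
def matches_search(event):
    """Emulate: index=main sourcetype=log4j OR sourcetype=access_combined.

    OR is evaluated before the implicit AND, so the index condition
    applies to both source types.
    """
    return event.get("index") == "main" and (
        event.get("sourcetype") in ("log4j", "access_combined")
    )


# Hypothetical events, reduced to the two fields the search filters on.
events = [
    {"index": "main", "sourcetype": "log4j"},
    {"index": "main", "sourcetype": "access_combined"},
    {"index": "main", "sourcetype": "syslog"},
    {"index": "test", "sourcetype": "access_combined"},
]

matched = [e for e in events if matches_search(e)]

if __name__ == "__main__":
    for event in matched:
        print(event)
```

Only the first two events pass the filter: the third has a different source type, and the fourth sits in a different index.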
In this case, you installed a Splunk application that leverages a scripted input. The script we wrote generates data for two source types. The access_combined source type contains sample web access logs, and the log4j source type contains application logs. These data sources will be used throughout the recipes in the book. Applications will also be discussed in more detail later on.
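As a rough illustration of the idea behind a scripted input, the sketch below shows a small Python script that produces one web access style event. In general, Splunk runs a scripted input on a schedule and indexes whatever the script writes to standard output. This is not the script shipped in the OpsDataGen app, whose internals are not shown here; the function names `format_access_event` and `emit` and all field values are hypothetical.

```python
import sys
from datetime import datetime, timezone


def format_access_event(host, method, path, status, size, agent="-", now=None):
    """Build one access-log style line in Apache combined format."""
    now = now or datetime.now(timezone.utc)
    timestamp = now.strftime("%d/%b/%Y:%H:%M:%S %z")
    return (
        f'{host} - - [{timestamp}] "{method} {path} HTTP/1.1" '
        f'{status} {size} "-" "{agent}"'
    )


def emit(line, stream=sys.stdout):
    """Write one event per line and flush so it is picked up promptly."""
    stream.write(line + "\n")
    stream.flush()


if __name__ == "__main__":
    # Hypothetical values chosen for this sketch.
    emit(format_access_event("192.0.2.10", "GET", "/index.html", 200, 2326))
```

Passing a fixed `now` value makes the output reproducible, which is convenient when checking that the generated line parses as expected.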