Predicting and trending

The following commands predict future values from historical data and pre-existing datasets, and create trends for better visualization of the data. With prediction, an error or issue that could arise in the future can be anticipated and preventive measures can be taken. For example, these commands can be used to predict a possible network outage, a device or server failure, and so on.

The predict command

The Splunk predict command predicts the future values of time series data. A time series is a set of values in a dataset recorded over time intervals. Examples include data generated by machines during their daily usage, or the stock price of any scrip over a day, week, month, year, and so on. In short, time series data is any data that has data points over time. For example, the predict command can be used to predict the condition of an LTE network for the next week based on the data of the current month, or the number of visitors a website is likely to get in the next week based on the current dataset. Thus, this command can be used to predict future performance, requirements, outages, and so on.

Take a look at the following query block for the syntax:

… | predict
    Fieldname (AS NewFieldName)
    Algorithm = LL / LLP / LLT / LLB / LLP5
    Future_timespan = Timespan
    Period = Period_value
    Correlate = Fieldname

Only the name of the field for which new values are to be predicted is compulsory. All other parameters are optional. The parameters of the predict command are described as follows:

  • Fieldname: The name of the field for which values are to be predicted. An AS clause followed by NewFieldName can be used to specify a custom name for the predicted field.
  • Algorithm: The algorithm used to compute the predicted value. The appropriate algorithm depends on the dataset. According to the Splunk documentation, the predict command uses the Kalman filter and its variant algorithms, namely LL, LLP, LLT, LLB, and LLP5:
    • Local Level (LL): Univariate model that does not consider trends and seasonality while predicting.
    • Seasonal Local Level (LLP): Univariate model with seasonality, where the periodicity is automatically computed.
    • Local Level Trend (LLT): Univariate model with trends but with no seasonality.
    • Bivariate Local Level (LLB): Bivariate model with no trends and no seasonality.
    • LLP5: Combination of the LLP and LLT models.
  • Future_timespan: A non-negative number (Timespan) that specifies how many future values are to be predicted.
  • Period: Defines the seasonal period of the time series dataset. This parameter is relevant only when the Algorithm parameter is set to LLP or LLP5; if it is omitted, the period is computed automatically.
  • Correlate: Name of the field (Fieldname) to correlate with in the case of the LLB algorithm.
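To give a feel for what a local level (LL) model does, the following Python sketch runs a scalar Kalman filter over a series and then forecasts a few steps ahead with an upper and lower bound. It is an illustration only and not Splunk's implementation: the function name, the fixed noise variances (obs_var, level_var), the seed values, and the 1.96 multiplier for a roughly 95% band are all assumptions made for this example.

```python
from math import sqrt


def local_level_forecast(values, steps=1, obs_var=1.0, level_var=0.1, z=1.96):
    """Filter a series with a local level model and forecast `steps` ahead.

    Returns a list of (prediction, lower, upper) tuples.
    The variances are fixed here for simplicity (an assumption); a real
    implementation would normally estimate them from the data.
    """
    if not values:
        raise ValueError("need at least one observation")

    level = values[0]  # assumed seed: first observation
    p = obs_var        # assumed initial uncertainty of the level estimate

    for y in values[1:]:
        p += level_var               # predict step: the level may have drifted
        gain = p / (p + obs_var)     # Kalman gain
        level += gain * (y - level)  # move the estimate towards the observation
        p *= (1 - gain)              # uncertainty shrinks after the update

    forecasts = []
    for _ in range(steps):
        p += level_var               # uncertainty grows with the horizon
        half_width = z * sqrt(p + obs_var)
        # With no trend and no seasonality, the forecast is the last level.
        forecasts.append((level, level - half_width, level + half_width))
    return forecasts


if __name__ == "__main__":
    for pred, low, high in local_level_forecast([12, 15, 14, 16, 15], steps=2):
        print(round(pred, 2), round(low, 2), round(high, 2))
```

Because the model has no trend or seasonal component, every forecast equals the last filtered level and only the band widens as the horizon grows, which is why the other algorithms exist for data with trends or periodic patterns.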

An example query that uses the predict command is shown as follows:

| inputcsv Prdiction.csv | eval _time=strptime(DateTime, "%d.%m.%Y %H:%M:%S") | timechart span=10m count(Value) AS Value | predict Value AS PredictedValue algorithm=LL future_timespan=1

The preceding query should produce an output like the following screenshot:

[Screenshot: tabular output of the predict command]

The preceding screenshot shows the tabular (statistical) output of a search result on Splunk web for the predict command. The following screenshot shows the same prediction in the visualization format:

[Screenshot: line chart visualization of the predict command]

In the preceding example, we used the predict command to predict the next value (future_timespan=1) of the Value field, using the Local Level (LL) algorithm. We used the strptime function of eval to convert the date and time (the DateTime field) into the time format that Splunk understands (the _time field). The line chart presents the prediction graphically, which makes the predicted result easier to understand. The predict command also predicts the lower and upper range of values for the predicted field. This command is very useful for predicting the future demand for a product, the number of visitors, KPI values, and so on, given the historical data. Thus, it can be used to plan for future requirements in advance.
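The time conversion step can be tried outside Splunk, since Python's datetime.strptime accepts the same format directives used in the query. The following is a minimal sketch; the helper name to_epoch and the sample timestamps are hypothetical, and UTC is assumed here, whereas Splunk applies its configured time zone.

```python
from datetime import datetime, timezone


def to_epoch(text, fmt="%d.%m.%Y %H:%M:%S"):
    """Parse a day.month.year timestamp and return epoch seconds.

    UTC is assumed for this illustration.
    """
    parsed = datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
    return parsed.timestamp()


if __name__ == "__main__":
    # Hypothetical sample values in the same layout as the DateTime field.
    print(to_epoch("01.01.1970 00:00:10"))
    print(to_epoch("02.01.1970 00:00:00"))
```

An epoch number of this kind is what lets timechart place events into 10-minute spans.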

The trendline command

The Splunk trendline command generates trends of the dataset for better understanding and visualization of the data. This command can generate moving averages, including the simple moving average, the exponential moving average, and the weighted moving average.

The syntax for the trendline command is as follows:

… |trendline (TrendType Period "("Fieldname")" AS NewFieldName)
      TrendType = ema / wma / sma

The description of the parameters of the preceding query is as follows:

  • TrendType: The trendline command currently supports three types of trend: simple moving average (sma), exponential moving average (ema), and weighted moving average (wma).

    SMA and WMA are both computed over the most recent Period values. WMA gives more weight to recent values than to older ones.

    EMA is calculated using the following formula:

    EMA(t) = alpha * EMA(t-1) + (1 - alpha) * field(t)

    where alpha = 2 / (period + 1) and field(t) is the current value of the field.

  • Period: The period over which the trend is to be computed. The value can range from 2 to 10000.
  • Fieldname: The name of the field for which the trend is to be calculated. An optional AS clause can be used to specify the new field name (NewFieldName) where the results will be written.
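The three trend types can be sketched in a few lines of Python. This is an illustration rather than Splunk's implementation: the linear weights used for WMA, the choice to seed the EMA with the first value, and returning None until a full window is available are all assumptions. The EMA function follows the recurrence exactly as quoted above.

```python
def sma(values, period):
    """Simple moving average; None until `period` values are available."""
    out = []
    for i in range(len(values)):
        if i + 1 < period:
            out.append(None)
        else:
            window = values[i + 1 - period:i + 1]
            out.append(sum(window) / period)
    return out


def wma(values, period):
    """Weighted moving average with linear weights 1..period (assumed).

    The newest value in each window receives the largest weight.
    """
    weights = list(range(1, period + 1))
    total = sum(weights)
    out = []
    for i in range(len(values)):
        if i + 1 < period:
            out.append(None)
        else:
            window = values[i + 1 - period:i + 1]
            out.append(sum(w * v for w, v in zip(weights, window)) / total)
    return out


def ema(values, period):
    """Exponential moving average using the recurrence quoted in the text."""
    alpha = 2 / (period + 1)
    out = []
    for i, v in enumerate(values):
        if i == 0:
            out.append(v)  # assumed seed: the first value
        else:
            out.append(alpha * out[-1] + (1 - alpha) * v)
    return out


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5]
    print(sma(data, 3))
    print(wma(data, 3))
    print(ema(data, 3))
```

On a steadily rising series, the WMA sits above the SMA for the same window because it leans on the newer, larger values.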

Take a look at the following example query:

| inputcsv datanse.csv | eval _time=strptime(date, "%e-%b-%y") | trendline sma5(DAX) AS Trend_DAX

The output of the preceding query should look like the following:

[Screenshot: tabular output of the trendline command]

The preceding screenshot shows the statistical output of the trendline command on the Splunk web console, whereas the following screenshot shows the same result in a visualization format:

[Screenshot: visualization of the trendline command]

In this example, we used stock index test data. Using the trendline command, the moving average of the DAX field is created as Trend_DAX. We calculated the simple moving average (sma) with a period of 5, which is why the query contains sma5(DAX). In the visualization, the simple moving average for DAX can be seen superimposed on the original DAX values. Thus, the trendline command can be used to calculate and visualize different moving averages of the specified field, from which inferences about the dataset can be drawn.

The x11 command

The Splunk x11 command is similar to the trendline command and is also used to create trends for the given time series data. The difference is the method: x11 uses the X11 algorithm to create the trend. The x11 command can be used to reveal the real trends in the data by removing seasonal fluctuations.

Take a look at the syntax for the x11 command:

… | x11
    Add() / Mult()
    Period
    Field_name AS New_Field_name

The parameter description for the x11 command is as follows:

  • Add()/Mult(): Specifies whether the computation is to be additive or multiplicative. The default is mult().
  • Period: Specifies the periodicity of the data, that is, the period of the data relative to the count of data points.
  • Field_name: The name of the field for which the seasonal trend is to be calculated using the X11 algorithm. It can be followed by an AS clause to specify the name of the new field, which will be shown in the result with the computed trend values.
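The difference between additive and multiplicative computation can be illustrated with a much simplified seasonal adjustment in Python. This is not the X11 algorithm itself, which applies several rounds of more elaborate filters; it is a classical decomposition sketch under stated assumptions: the function names are hypothetical, the trend is a centered moving average, at least two full periods of data are required, and the multiplicative mode expects strictly positive values.

```python
def centered_ma(values, period):
    """Centered moving average; None where the window does not fit."""
    half = period // 2
    out = [None] * len(values)
    for i in range(half, len(values) - half):
        window = values[i - half:i + half + 1]
        if period % 2:
            out[i] = sum(window) / period
        else:
            # Even period: the two end points each get half weight.
            out[i] = (sum(window) - 0.5 * (window[0] + window[-1])) / period
    return out


def seasonally_adjust(values, period, mode="mult"):
    """Remove a repeating seasonal pattern from `values`.

    mode="add":  value = trend + seasonal
    mode="mult": value = trend * seasonal (values must be positive)
    """
    if mode not in ("add", "mult"):
        raise ValueError("mode must be 'add' or 'mult'")
    if len(values) < 2 * period:
        raise ValueError("need at least two full periods of data")

    trend = centered_ma(values, period)

    # Collect the detrended values for each position within the period.
    buckets = [[] for _ in range(period)]
    for i, (v, t) in enumerate(zip(values, trend)):
        if t is None:
            continue
        buckets[i % period].append(v / t if mode == "mult" else v - t)

    seasonal = [sum(b) / len(b) for b in buckets]

    # Normalise so the seasonal component is neutral over one period.
    mean = sum(seasonal) / period
    if mode == "mult":
        seasonal = [s / mean for s in seasonal]
        return [v / seasonal[i % period] for i, v in enumerate(values)]
    seasonal = [s - mean for s in seasonal]
    return [v - seasonal[i % period] for i, v in enumerate(values)]


if __name__ == "__main__":
    pattern = [2, -1, -2, 1]
    series = [10 + i + pattern[i % 4] for i in range(12)]
    print(seasonally_adjust(series, 4, mode="add"))
```

In the additive case the seasonal effect is a fixed offset, while in the multiplicative case it scales with the level of the series, which suits data whose swings grow as the values grow.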

Take a look at the following example:

| inputcsv datanse.csv | eval _time=strptime(date, "%e-%b-%y") | table _time DAX | x11 DAX AS Trend_X11_DAX

The output generated should be like the one that follows:

[Screenshot: tabular output of the x11 command]

The preceding screenshot shows the output of the x11 command in a tabular (statistical) format, whereas the following screenshot shows the same result in the form of visualization:

[Screenshot: visualization of the x11 command]

As explained earlier, the Splunk x11 command computes the trends of the data by removing seasonality. The data used in this example is the same as for the trendline command, so the output of the trendline and x11 commands can be compared, since both compute trends. The visual difference between the graphs shows what the trends look like when seasonality is removed during the computation.
