Creating the ML job for forecasting

The example dataset spans January 31st, 2017 to March 11th, 2017. Although the data is from the past, we can contrive a scenario in which we pretend that today's date is March 1st, 2017. We therefore want an ML job to analyze the data between January 31st and "today", and then use ML to forecast that data ten days into the future.

If your Kibana time zone is set to your local time, the dates in this chapter may look slightly different as the screenshots were taken with a version of Kibana that's been set to the Eastern time zone of the United States.

Afterward, we can have the ML job analyze the rest of the data in the index (from March 1st through March 11th) and see how closely the forecast tracked the actual data. Let's begin!

The first step is to create a Single Metric job that uses the Sum aggregation function on the #amount field with a Bucket span of 15m:

However, we don't want to analyze all of the data; we want to stop short on March 1st. Therefore, in the top right, change the end date of the Kibana date picker to be March 1st:

With that date modification made, you can name the job a_forecast_example and click the Create Job button:
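For readers who prefer to think in terms of the API rather than the wizard, the following is a rough sketch of what an equivalent job definition and analysis window might look like. The detector settings (sum, amount, 15m) and the job name come from the steps above; the `@timestamp` time field, the `datafeed-` ID prefix, and the UTC day boundaries are assumptions made for illustration, and the sketch only prints request bodies rather than calling a cluster. The `_ml` endpoint prefix applies to recent Elasticsearch versions; older releases used `_xpack/ml`. Creating the datafeed and opening the job are omitted.

```python
import json
from datetime import datetime, timezone

JOB_ID = "a_forecast_example"

# Anomaly detection job: sum of "amount" in 15-minute buckets.
job_config = {
    "analysis_config": {
        "bucket_span": "15m",
        "detectors": [{"function": "sum", "field_name": "amount"}],
    },
    # Assumption: the index uses "@timestamp" as its time field.
    "data_description": {"time_field": "@timestamp"},
}

# Analysis window: stop at "today" in the scenario, March 1st, 2017.
# Assumption: UTC day boundaries (the screenshots use US Eastern time).
start = datetime(2017, 1, 31, tzinfo=timezone.utc)
end = datetime(2017, 3, 1, tzinfo=timezone.utc)


def to_epoch_ms(dt):
    """Convert an aware datetime to milliseconds since the Unix epoch."""
    return int(dt.timestamp() * 1000)


datafeed_window = {"start": to_epoch_ms(start), "end": to_epoch_ms(end)}

print(f"PUT _ml/anomaly_detectors/{JOB_ID}")
print(json.dumps(job_config, indent=2))
# Assumption: the datafeed is named with a "datafeed-" prefix.
print(f"POST _ml/datafeeds/datafeed-{JOB_ID}/_start")
print(json.dumps(datafeed_window))
```

Limiting the `end` of the datafeed window plays the same role as moving the end date in the Kibana date picker: the job never sees data after March 1st.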

Once the job has run, we can see the results preview in the UI:

To access the ability to forecast, we need to click the View Results button, which will take us to the Single Metric Viewer. In the Single Metric Viewer, we can see the overall dataset and can appreciate the shape and complexity of the way this data behaves; there are both daily and weekly periodic components, as well as a gradual positive slope/trend that causes the data to drift up over time:

Remember, despite the fact that we may only be interested in forecasting on this data, the ML job will still point out anomalies throughout the data's history, but we can simply ignore them.

To invoke a forecast on this data, click the Forecast button and in the dialog box, enter a duration of 10 days (10d):

There is currently a hard eight-week limit on the duration (this is likely to be extended in future versions of ML). Also, do not ask for a forecast duration that is longer than the span of data that the ML job has analyzed. In other words, don't ask for a two-week forecast if the ML job has only ever seen one week of data. Lastly, supply enough consistent data for ML to learn the principal patterns. For example, at least three cycles of a periodic pattern are recommended to achieve the best possible predictions.
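The duration rules above can be expressed as a small pre-flight check. This is a minimal sketch, not part of Elastic ML: `parse_duration` and `check_forecast` are hypothetical helpers, the parser only understands a simple `<number><unit>` form with minutes, hours, and days, and the eight-week limit is taken from the text rather than queried from the cluster. The last lines print the forecast request that corresponds to the 10d duration entered in the dialog box.

```python
import json
import re
from datetime import datetime, timedelta

MAX_FORECAST = timedelta(weeks=8)  # hard limit quoted in the text

# Hypothetical helper: supports only minutes, hours, and days.
_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text):
    """Turn a string such as '10d' into a timedelta."""
    match = re.fullmatch(r"(\d+)([mhd])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})


def check_forecast(duration, analyzed_start, analyzed_end):
    """Return a list of reasons the requested forecast is inadvisable."""
    requested = parse_duration(duration)
    analyzed = analyzed_end - analyzed_start
    problems = []
    if requested > MAX_FORECAST:
        problems.append("exceeds the eight-week limit")
    if requested > analyzed:
        problems.append("longer than the data the job has analyzed")
    return problems


# The job in this example analyzed January 31st through March 1st, 2017.
start, end = datetime(2017, 1, 31), datetime(2017, 3, 1)
print(check_forecast("10d", start, end))
print(check_forecast("30d", start, end))
print(check_forecast("60d", start, end))

# Request corresponding to the Forecast dialog with a duration of 10d.
print("POST _ml/anomaly_detectors/a_forecast_example/_forecast")
print(json.dumps({"duration": "10d"}))
```

With 29 days of analyzed data, a 10-day request passes both checks, a 30-day request fails only the "longer than analyzed data" rule, and a 60-day request fails both.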

Clicking the Run button invokes the forecast request, which runs in the background. Once it finishes, the results of the forecast are displayed, extending over the time period of interest:

The shaded area around the forecast/predicted zone is the 95% confidence interval. In other words, ML has estimated that there is a 95% chance that the future values will be within this range (and likewise, only a 2.5% chance that the future values will be above the interval and a 2.5% chance that they will be below it). The 95% range is currently a fixed value and is not yet settable by the user.
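The tail arithmetic behind a 95% interval can be checked with a few lines of Python. This sketch assumes a normally distributed forecast error purely for illustration; it does not describe how Elastic ML actually computes its bounds, and the predicted value and standard deviation are made-up numbers.

```python
from statistics import NormalDist

confidence = 0.95
# The remaining 5% is split evenly between the two tails.
tail = (1 - confidence) / 2

# Illustration only: hypothetical prediction and spread.
predicted, std_dev = 1000.0, 50.0

# z-score that leaves `tail` probability above it (about 1.96 for 95%).
z = NormalDist().inv_cdf(1 - tail)
lower = predicted - z * std_dev
upper = predicted + z * std_dev

print(f"chance above the interval: {tail:.3f}")
print(f"chance below the interval: {tail:.3f}")
print(f"interval: [{lower:.1f}, {upper:.1f}]")
```

Under these assumptions the band extends roughly 1.96 standard deviations on either side of the predicted value, which is why a noisier series produces a wider shaded area.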

Now that we have the ability to create simple forecasts from the UI, let's explore the results of the forecast in more depth before moving to a more complicated example.
