In the previous chapter, you trained your first predictor on a household energy consumption dataset. You used the fully automated machine learning (AutoML) approach offered by default by Amazon Forecast, which let you obtain an accurate forecast without any ML or statistical knowledge about time series forecasting.
In this chapter, you will continue to work on the same datasets, but you will explore the flexibility that Amazon Forecast gives you when training a predictor. This will allow you to better understand when and how you can adjust your forecasting approach based on specificities in your dataset or specific domain knowledge you wish to leverage.
In this chapter, we're going to cover the following main topics:
No hands-on experience of a language such as Python or R is necessary to follow along with the content from this chapter. However, we highly recommend that you read this chapter while being connected to your own Amazon Web Services (AWS) account and open the Amazon Forecast console to run the different actions on your end.
To create an AWS account and log in to the Amazon Forecast console, you can refer to the Technical requirements section of Chapter 2, An Overview of Amazon Forecast.
The content in this chapter assumes that you have a dataset already ingested and ready to be used to train a new predictor. If you don't, I recommend that you follow the process detailed in Chapter 3, Creating a Project and Ingesting your Data. You will need the Simple Storage Service (S3) location of your three datasets, which should look like this (replace <YOUR_BUCKET> with the name of the bucket of your choice):
You will also need the Identity and Access Management (IAM) role you created to let Amazon Forecast access your data from Amazon S3. The unique Amazon Resource Name (ARN) of this role should have the following format: arn:aws:iam::<ACCOUNT-NUMBER>:role/AmazonForecastEnergyConsumption. Here, you need to replace <ACCOUNT-NUMBER> with your AWS account number.
With this information, you are now ready to dive into the more advanced features offered by Amazon Forecast.
In Chapter 4, Training a Predictor with AutoML, we let Amazon Forecast make all the choices for us and left all the parameters at their default values, including the choice of algorithm. When you follow this path, Amazon Forecast applies every algorithm it knows on your dataset and selects the winning one by looking at which one achieves the best average weighted absolute percentage error (WAPE) metric in your backtest window (if you kept the default choice for the optimization metric to be used).
At the time of writing this chapter, Amazon Forecast knows about six algorithms. The AutoML process is great when you don't have a precise idea about the algorithm that will give the best result with your dataset. The AutoPredictor settings also give you the flexibility to experiment easily with an ensembling technique that will let Amazon Forecast devise the best combination of algorithms for each time series of your dataset. However, both these processes can be quite lengthy as they will, in effect, train and evaluate up to six models to select only one algorithm at the end (for AutoML) or a combination of these algorithms for each time series (for AutoPredictor).
Once you have some experience with Amazon Forecast, you may want to cut down on this training time (which will save you both time and training costs) and directly select the right algorithm among the ones available in the service.
When you open the Amazon Forecast console to create a new predictor and disable the AutoPredictor setting, you are presented with the following options in an Algorithm selection dropdown:
In this chapter, we will dive into each of the following algorithms:
For each of these, I will give you some theoretical background that will help you understand the behaviors these algorithms are best suited to capture. I will also give you some pointers about when your use case will be a good fit for any given algorithm.
ETS stands for Error, Trend, and Seasonality, and is part of the exponential smoothing family of statistical algorithms. These algorithms work well when the number of time series is small (fewer than 100).
The idea behind exponential smoothing is to use a weighted average of all the previous values in a given time series, to forecast future values. This approach approximates a given time series by capturing the most important patterns and leaving out the finer-scale structures or rapid changes that may be less relevant for forecasting future values.
The ETS implementation of Amazon Forecast performs exponential smoothing accounting for trends and seasonality in your time series data by leveraging several layers of smoothing.
Trend and seasonal components can be additive or multiplicative by nature: if we consume 50% more electricity during winter, the seasonality is multiplicative. On the other hand, if we sell 1,000 more books during Black Friday, the seasonality will be additive.
Let's call yt the raw values of our initial time series and let's decompose the different smoothing layers in the case of additive trend and seasonality, as follows:
When a trend component is present in our time series, the previous approach does not do well: to counter this, we add a trend smoothing parameter, generally called β (beta), that we apply in a similar fashion to the difference between successive timesteps of our series, just as the level smoothing parameter α is applied to the raw values. β is also set between 0 and 1. This yields the double exponential smoothing, and the point forecast k periods ahead can be derived from both the series level Lt and the series slope Bt at the last observed timestep, as illustrated in the following formula:

ŷ(t+k) = Lt + k·Bt
If our signal also contains additional high-frequency (HF) patterns such as a seasonality component, we need to add a third exponential smoothing parameter, usually called γ (gamma), the seasonal smoothing parameter (also set between 0 and 1, as α and β are). If m denotes the number of seasons, our new point forecast k periods ahead can now be derived with the triple exponential smoothing, using the series level Lt, the series slope Bt, and the seasonal component St at the last observed timestep, as follows (for k up to m; beyond that, the seasonal index wraps around):

ŷ(t+k) = Lt + k·Bt + S(t+k–m)
The ETS algorithm is also able to apply a damping parameter when there is a trend in a time series. When computing several periods ahead (in other words, when predicting a forecast horizon of several timesteps), we use the same slope as determined at the end of the historical time series for each forecast period. For long forecast periods, this may seem unrealistic, and we might want to dampen the detected trend as the forecast horizon increases. Accounting for this damping parameter φ (phi), strictly set between 0 and 1, the previous equations become these:

ŷ(t+k) = Lt + (φ + φ² + … + φᵏ)·Bt + S(t+k–m)
When Amazon Forecast trains a model with ETS, it uses the default parameters of the ets function available in the R forecast package. This means that all the ets parameters are automatically estimated. In addition, Amazon Forecast also automatically selects the model type according to the following criteria:
All of these choices are abstracted away from you: multiple models are run behind the scenes and the best one is selected.
Amazon Forecast will create a local model for each time series available in your dataset. In the energy consumption example we have been running, each household electricity consumption would be modeled as a single ETS model.
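To make these smoothing recursions concrete, here is a minimal sketch of additive triple exponential smoothing (Holt-Winters) in plain Python. This is an illustration of the method only, not the actual R ets implementation Amazon Forecast calls, and the initialization scheme shown is one simple convention among several:

```python
def holt_winters_additive(y, alpha, beta, gamma, m, horizon):
    """Additive Holt-Winters: level (alpha), trend (beta), seasonality (gamma).

    y: historical values, m: season length, horizon: timesteps to forecast.
    """
    # Initialize the level from the first season and the trend from the
    # average per-step increase between the first two seasons
    level = sum(y[:m]) / m
    trend = (sum(y[m:2 * m]) - sum(y[:m])) / (m * m)
    season = [y[i] - level for i in range(m)]

    for t in range(m, len(y)):
        prev_level = level
        # Smooth the deseasonalized observation against the previous level + trend
        level = alpha * (y[t] - season[t % m]) + (1 - alpha) * (level + trend)
        # Smooth the level difference against the previous trend
        trend = beta * (level - prev_level) + (1 - beta) * trend
        # Update the seasonal component for this position in the cycle
        season[t % m] = gamma * (y[t] - level) + (1 - gamma) * season[t % m]

    T = len(y)
    # Point forecast k periods ahead: level + k * slope + matching seasonal term
    return [level + k * trend + season[(T + k - 1) % m]
            for k in range(1, horizon + 1)]
```

On a noiseless, purely seasonal series, this recursion reproduces the pattern exactly; on real data, the smoothing parameters themselves have to be estimated, which is exactly the work the service abstracts away from you.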
If you want to get more in-depth details about the different exponential smoothing methods (including details of the equations when considering damping, and multiplicative trends and seasonality), how all these parameters are automatically computed, and which variation is automatically selected, I recommend you deep dive into this paper by Rob Hyndman et al.: A state space framework for automatic forecasting using exponential smoothing methods. Here is a persistent link to this paper: https://doi.org/10.1016/S0169-2070(01)00110-8.
Along with ETS, ARIMA (autoregressive integrated moving average) is another very well-known family of flexible statistical forecasting models. ARIMA is a generalization of the ARMA (autoregressive moving average) family of models, where ARMA describes a time series with two polynomials: one for the autoregressive (AR) part and the other for the moving average (MA) part.
Although this family of models involves inputting many parameters and computing coefficients, Amazon Forecast leverages the auto.arima method from the forecast R package available on the Comprehensive R Archive Network (CRAN). This method conducts a search over the possible values of the different ARIMA parameters to identify the best model in this family.
Let's call yt the raw values of our initial time series and let's decompose the different steps needed to estimate an ARIMA model, as follows:
When Amazon Forecast trains a model with ARIMA, it uses the default parameters of the arima function available in the R forecast package. Amazon Forecast also uses the auto.arima method, which conducts a search of the parameter space to automatically find the best ARIMA parameters for your dataset. This includes exploring a range of values for the following:
This means that all these arima parameters are automatically estimated and that the model type is also automatically selected: all of these choices are abstracted away from you as multiple models are run behind the scenes, and the best one is selected.
Amazon Forecast will create a local model for each time series available in your dataset. In the energy consumption example we have been running, each household electricity consumption would be modeled as a single ARIMA model.
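To give some intuition for the AR and I parts of ARIMA, here is a toy sketch (deliberately simpler than auto.arima) that differences a series to remove trend and then fits a single AR(1) coefficient by least squares:

```python
def difference(y, d=1):
    # The "I" in ARIMA: difference the series d times to remove trend
    for _ in range(d):
        y = [y[i] - y[i - 1] for i in range(1, len(y))]
    return y

def fit_ar1(y):
    # Least-squares estimate of phi in the AR(1) model y[t] = phi * y[t-1]
    # (no intercept, no MA term -- the simplest member of the family)
    num = sum(y[t - 1] * y[t] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    return num / den
```

A full ARIMA(p, d, q) fit generalizes this to p autoregressive lags, d differencing passes, and q moving-average terms; auto.arima searches over these orders for you.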
If you want to get more in-depth details about the automatic parameter choice from ARIMA, I recommend that you deep dive into this paper by Hyndman and Khandakar: Automatic Time Series Forecasting: The forecast Package for R. Here is a persistent link to this paper: https://doi.org/10.18637/jss.v027.i03.
A simple forecaster is an algorithm that uses one of the past observed values as the forecast for the next timestep, as outlined here:
NPTS falls into this class of simple forecaster.
As just mentioned, NPTS falls in the simple forecasters' category. However, it does not use a fixed time index as the last value: rather, it randomly samples one of the past values to generate a prediction. By sampling multiple times, NPTS is able to generate a predictive distribution that Amazon Forecast can use to compute prediction intervals.
Let's call yt the raw values of our initial time series, with t ranging from 0 to T – 1, where T is the timestep for which we want to deliver a prediction ŷT. The prediction is simply as follows:

ŷT = yt

Here, the time index t is sampled from a sampling distribution qT. NPTS uses an exponentially decaying sampling distribution of the following form:

qT(t) ∝ exp(–α|T – t|)

Here, α is a kernel weight hyperparameter that should be adjusted based on the data and helps you control the amount of decay in the weights. This allows NPTS to sample recent past values with a higher probability than observations from a distant past.
This hyperparameter can take values from 0 to infinity. Here are the meanings of these extreme values:
Once you have generated a prediction for the next step, you can include this prediction in your past data and generate subsequent predictions, giving NPTS the ability to sample previously predicted data.
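The sampling idea can be sketched in a few lines of Python. This is an illustrative toy, not the service's implementation; the exact exponential kernel shape is an assumption based on the description above:

```python
import math
import random

def npts_forecast_samples(y, alpha, num_samples, seed=0):
    # Weight each past observation with an exponentially decaying kernel:
    # the closer an observation is to "now", the more likely it is sampled
    T = len(y)
    weights = [math.exp(-alpha * (T - 1 - t)) for t in range(T)]
    rng = random.Random(seed)
    return rng.choices(y, weights=weights, k=num_samples)
```

Quantiles of the returned samples give the prediction intervals. With a very large alpha, only the most recent value is ever sampled (a naive forecast); with alpha = 0, all past values are equally likely (the climatological forecast).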
If you want to get in-depth details about this algorithm and, more generally, about intermittent data forecasting, you can dive deeper into the following articles:
Let's now have a look at how this algorithm has been implemented in Amazon Forecast.
Here are the key parameters the NPTS algorithm lets you select:
When a seasonal model is used, you also have the ability to request Amazon Forecast to automatically provide and use seasonal features that depend on the forecast granularity by setting the use_default_time_features parameter to True. Let's say, for instance, that your data is available at the hourly level: if this parameter is set to True and you want to give a prediction for 3 p.m., then the NPTS algorithm will only sample past observations that also happened at 3 p.m.
Prophet is a forecasting algorithm that was open sourced by Facebook in February 2017.
The Prophet algorithm is similar to generalized additive models with four components, as follows:
Here, the following applies:
By default, Prophet uses N = 10 for yearly seasonality and N = 3 for weekly seasonality.
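The N values above refer to the number of Fourier terms Prophet uses to approximate each seasonal component. Here is a minimal sketch of how such features are built (illustrative only, not Prophet's actual code):

```python
import math

def fourier_features(t, period, N):
    # 2N sine/cosine terms approximating a smooth seasonal curve
    # of the given period (in the same time unit as t)
    feats = []
    for n in range(1, N + 1):
        feats.append(math.sin(2.0 * math.pi * n * t / period))
        feats.append(math.cos(2.0 * math.pi * n * t / period))
    return feats
```

With N = 10 for a yearly period, each timestamp is described by 20 such features, and the seasonal component of the model is a learned linear combination of them.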
If you want to know more about the theoretical details of Prophet, you can read through the original paper published by Facebook available at the following link: https://peerj.com/preprints/3190/.
Amazon Forecast uses the Prophet class of the Python implementation of Prophet using all the default parameters.
Amazon Forecast will create a local model for each time series available in your dataset. In the energy consumption example we have been running, each household electricity consumption would be modeled as a single Prophet model.
If you want to get more in-depth details about the impact of the default parameter choice from Prophet, I recommend that you read through the comprehensive Prophet documentation available at the following link: https://facebook.github.io/prophet/.
Classical methods we have been reviewing—such as ARIMA, ETS, or NPTS—fit a single model to every single time series provided in a dataset. They then use each of these models to provide a prediction for the desired time series. In some applications, you may have hundreds or thousands of similar time series that evolve in parallel. This is the case for the number of units sold for each product on an e-commerce website, or the electricity consumption of every household served by an energy provider. For these cases, leveraging global models that learn from multiple time series jointly may provide more accurate forecasts. This is the approach taken by DeepAR+.
DeepAR+ is a supervised algorithm for univariate time series forecasting. It uses recurrent neural networks (RNNs) and a large number of time series to train a single global model. At training time, for each time series ingested into the target time series dataset, DeepAR+ generates multiple sequences (time series snippets) with different starting points in the original time series to improve the learning capacity of the model.
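The windowing idea is easy to illustrate: from a single series, many (context, target) training pairs are carved out at different starting points. The sketch below is a simplification (real implementations also sample start points randomly and pad short series):

```python
def make_training_windows(series, context_length, horizon):
    # Slide over the series, pairing each context window with
    # the horizon-length segment that immediately follows it
    windows = []
    for start in range(len(series) - context_length - horizon + 1):
        context = series[start:start + context_length]
        target = series[start + context_length:start + context_length + horizon]
        windows.append((context, target))
    return windows
```

Every household series in our dataset would contribute many such pairs, which is what lets one global model learn from the whole population at once.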
The underlying theoretical details are beyond the scope of this book: if you want to know more about the theoretical details of the DeepAR algorithm, you can do the following:
The key parameters the DeepAR+ algorithm lets you select are listed here:
DeepAR+ also lets you customize the learning parameters, as follows:
The architecture of the model can be customized with the following two parameters:
CNN-QR leverages a similar approach to DeepAR+, as it also builds a global model, learning from multiple time series jointly to provide more accurate forecasts.
CNN-QR is a sequence-to-sequence (Seq2Seq) model that uses a large number of time series to train a single global model. In a nutshell, this type of model tests how well a prediction reconstructs a decoding sequence conditioned on an encoding sequence. It uses a quantile decoder to make multi-horizon probabilistic predictions.
The underlying theoretical details are beyond the scope of this book, but if you want to know more about some theoretical work leveraged by algorithms such as CNN-QR, you can read through the following sources:
The key parameters the CNN-QR algorithm lets you select are these:
The Amazon Forecast documentation includes a table to help you compare the different built-in algorithms used by this service. I recommend you refer to it, as new approaches and algorithms may have been included in the service since the time of writing this book. You can find it at this link: https://docs.aws.amazon.com/forecast/latest/dg/aws-forecast-choosing-recipes.html.
Based on the algorithms available at the time of writing, I expanded this table to add a few other relevant criteria. The following table summarizes the best algorithm options you can leverage, depending on the characteristics of your time series datasets:
Let's now expand beyond this summary table and dive into some specifics of each of these algorithms, as follows:
Now that you know how to select an algorithm suitable for your business needs, let's have a look at how you can customize each of them and override the default parameters we left as is when you trained your first predictor in the previous chapter.
Training an ML model is a process that consists of finding parameters that will help the model to better deal with real data. When you train your own model without using a managed service such as Amazon Forecast, you can encounter three types of parameters, as follows:
Amazon Forecast takes on the responsibility of managing the model selection parameters and coefficients for you. In the AutoML process, it also uses good default values for the hyperparameters of each algorithm and applies hyperparameter optimization (HPO) for the deep learning (DL) algorithms (DeepAR+ and CNN-QR) to make things easier for you.
When you manually select an algorithm that can be tuned, you can also enable an HPO process to fine-tune your model and reach the same performance as with AutoML, but without the need to train a model with each of the available algorithms.
In learning algorithms such as the ones leveraged by Amazon Forecast, HPO is the process used to choose the optimal values of the parameters that control the learning process.
Several approaches can be used, such as grid search (which is simply a parameter sweep) or random search (selecting hyperparameter combinations randomly). Amazon Forecast uses Bayesian optimization.
The theoretical background of such an optimization algorithm is beyond the scope of this book, but simply speaking, HPO matches a given combination of parameter values with the learning metrics (such as WAPE, weighted quantile loss (wQL), or root mean square error (RMSE), as mentioned in Chapter 4, Training a Predictor with AutoML). This matching is seen as a function by Bayesian optimization, which tries to build a probabilistic model for it. The Bayesian process iteratively updates hyperparameter configuration and gathers information about the location of the optimum values of the matching functions modeled.
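For contrast, the grid search baseline mentioned above is simple to sketch: it exhaustively trains and evaluates one model per combination of hyperparameter values, which is exactly the cost Bayesian optimization tries to avoid. Here, train_eval is a stand-in for a full training-plus-backtest run:

```python
from itertools import product

def grid_search(train_eval, param_grid):
    # train_eval: maps a hyperparameter dict to a loss metric
    # (for example, WAPE); lower is better
    best_params, best_loss = None, float("inf")
    for combo in product(*param_grid.values()):
        params = dict(zip(param_grid.keys(), combo))
        loss = train_eval(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss
```

Because each call to train_eval is a complete model training, grid search becomes prohibitively expensive as the number of hyperparameters grows; Bayesian optimization instead uses the results gathered so far to decide which combination to try next.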
To learn more about HPO algorithms, you can read through the following article: https://proceedings.neurips.cc/paper/2011/file/86e8f7ab32cfd12577bc2619bc635690-Paper.pdf.
If you want to read more about Bayesian optimization and why it is better than grid search or random search, the following article will get you started:
Practical Bayesian Optimization of ML Algorithms (https://arxiv.org/abs/1206.2944)
Let's now train a new predictor, but explicitly using HPO.
In this section, we are going to train a second predictor, using the winning algorithm found in Chapter 4, Training a Predictor with AutoML. As a reminder, the winning algorithm was CNN-QR and we achieved the following performance:
To train a new predictor, log in to the AWS console, navigate to the Amazon Forecast console, and click on the View dataset groups button. There, you can click on the name of the dataset group you created in Chapter 3, Creating a Project and Ingesting your Data. Check out this chapter if you need a reminder on how to navigate through the Amazon Forecast console. You should now be on the dashboard of your dataset group, where you will click on the Train predictor button, as illustrated in the following screenshot:
Once you click the Train predictor button, the following screen will open up to let you customize your predictor settings:
For your Predictor settings options, we are going to fill in the following parameters:
In the next section, disable the AutoPredictor toggle and select CNN-QR in the Algorithm selection dropdown.
Click on Perform hyperparameter optimization (HPO). A new section called HPO Config appears, where you can decide the range of values you want Amazon Forecast to explore for the different tunable hyperparameters of CNN-QR. This configuration must be written in JavaScript Object Notation (JSON) format.
Let's enter the following JSON document into the HPO Config widget:
{
    "ParameterRanges": {
        "IntegerParameterRanges": [
            {
                "MaxValue": 360,
                "MinValue": 60,
                "ScalingType": "Auto",
                "Name": "context_length"
            }
        ]
    }
}
With this configuration, we are telling Amazon Forecast that we want it to explore a hyperparameter space defined by these potential values:
In the Training parameters section, we will leave the epochs number at 100 (a similar number to the default value used by the AutoML procedure) for comparison purposes, as follows:
{
    "epochs": "100"
    [...]
}
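If you prefer scripting over the console, the same settings can be expressed through the CreatePredictor API of the AWS SDK for Python (boto3). The sketch below only builds the request; the predictor name, forecast horizon, forecast frequency, and dataset group ARN are hypothetical placeholders you would replace with your own values:

```python
import json

# Hypothetical placeholder -- replace with your own dataset group ARN
DATASET_GROUP_ARN = "arn:aws:forecast:us-east-1:123456789012:dataset-group/energy_consumption"

create_predictor_request = {
    "PredictorName": "energy_cnnqr_hpo",            # hypothetical name
    "AlgorithmArn": "arn:aws:forecast:::algorithm/CNN-QR",
    "ForecastHorizon": 24,                          # hypothetical horizon
    "PerformHPO": True,
    "HPOConfig": {
        "ParameterRanges": {
            "IntegerParameterRanges": [
                {"Name": "context_length", "MinValue": 60,
                 "MaxValue": 360, "ScalingType": "Auto"}
            ]
        }
    },
    "TrainingParameters": {"epochs": "100"},
    "InputDataConfig": {"DatasetGroupArn": DATASET_GROUP_ARN},
    "FeaturizationConfig": {"ForecastFrequency": "H"},  # hourly data assumed
}

# To actually launch the training (requires AWS credentials):
# import boto3
# forecast = boto3.client("forecast")
# response = forecast.create_predictor(**create_predictor_request)

print(json.dumps(create_predictor_request["HPOConfig"], indent=4))
```

Note that this is the same JSON document we entered in the HPO Config widget, simply nested inside the broader request.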
Leave all the other parameters at their default values and click Create to start the training of our new predictor with the HPO procedure. Our predictor will be created with the Create in progress status, as illustrated in the following screenshot:
After a while (a little less than 2 hours, in my case), our new predictor is trained and ready to be used, as can be seen here:
In your case, the mean absolute percentage error (MAPE) metric may be slightly different but will be close to the number you see here. You can see that in my case, I obtained a MAPE value of 0.2281, which is extremely close to the value obtained in the AutoML process (which was 0.2254). As explained in the What is HPO? section, the slight differences come from the random nature of the Bayesian procedure.
You now know how to configure HPO to your predictor training. However, not all algorithms can be fine-tuned using this process, and not all parameters can be defined as targets with ranges to explore. In the next section, we will look at the options you have for each algorithm.
When you know which algorithm will give the best results for your datasets, you may want to ensure that you get the most out of this algorithm by using HPO. Let's have a look at what is feasible for each algorithm, as follows:
In conclusion, HPO is a process that is available for DL models (DeepAR+ and CNN-QR). The other algorithms do not take advantage of HPO. Now that you know how to override the default hyperparameters of the algorithms leveraged by Amazon Forecast, we are going to look at how you can use a stronger backtesting strategy to improve the quality of your forecasts.
In ML, backtesting is a technique used in forecasting to provide the learning process with two datasets, as follows:
As a reminder, here are the different elements of backtesting in Amazon Forecast, as outlined in Chapter 4, Training a Predictor with AutoML:
When dealing with time series data, the split must mainly be done on the temporal axis (and, to a lesser extent, on the item population) to prevent any data leak from the past data to the future. This is paramount to make your model robust enough for when it will have to deal with actual production data.
By default, when you leave the default parameter as is (when selecting AutoML or when selecting an algorithm manually), Amazon Forecast uses one backtest window with a length equal to the forecast horizon. However, it is a good practice to provide multiple start points to remove any dependency of your predictions on the start date of your forecast. When you select multiple backtest windows, Amazon Forecast computes the evaluation metrics of your models for each window and averages the results. Multiple backtest windows also help deal with a different level of variability in each period, as can be seen here:
In the previous screenshot, you can see what Amazon Forecast will do when you select five backtest windows that have the same length as the forecast horizon. This is essentially what happens:
Configuring your backtesting strategy happens in the Predictor settings section when you wish to create a new predictor without using the AutoPredictor settings, as illustrated in the following screenshot:
After you make your algorithm selection, you can choose from the following options:
Let's create a new predictor with the following parameters:
Leave all the other parameters at their default values and click Start to create a new predictor with these settings. After an hour or so of training, you can explore the accuracy metric of this new predictor, as illustrated in the following screenshot:
You will notice that Amazon Forecast computes the accuracy metrics for each backtest window and also gives you the average of these metrics over five windows. You may note some fluctuations of the metrics, depending on the backtest window: this can be a good indicator of the variability of your time series across different time ranges, and you may run some more thorough investigation of your data over these different periods to try to understand the causes of these fluctuations.
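The windowing logic described above can be sketched as follows, where each backtest window of length equal to the forecast horizon is shifted back from the end of the series and the per-window metrics are averaged (an illustration of the scheme, not the service's code):

```python
def backtest_splits(n_points, forecast_horizon, n_windows):
    # Returns (train_end, test_end) index pairs, newest window first:
    # train on [0, train_end), evaluate on [train_end, test_end)
    splits = []
    for i in range(n_windows):
        test_end = n_points - i * forecast_horizon
        splits.append((test_end - forecast_horizon, test_end))
    return splits

def average_metric(per_window_metrics):
    # Amazon Forecast reports the mean of each metric across backtest windows
    return sum(per_window_metrics) / len(per_window_metrics)
```

Because every window only ever trains on data strictly older than its evaluation range, no information leaks from the future into the model.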
You now know how to customize your backtesting strategy to make your training more robust. We are now going to look at how you can customize the features of your dataset by adding features provided by Amazon Forecast (namely, holidays and weather data).
At the time of writing this book, Amazon Forecast includes two built-in datasets that you can leverage as engineered supplementary features: Holidays and the Weather index.
This supplementary feature includes a dataset of national holidays for 66 countries. You can enable this feature when you create a new predictor: on the predictor creation page, scroll down to the optional Supplementary features section and toggle on the Enable holidays button, as illustrated in the following screenshot:
Once enabled, a drop-down list appears, to let you select the country for which you want to enable holidays. Note that you can only select one country: your whole dataset must pertain to this country. If you have several countries in your dataset and wish to take different holidays into account, you will have to split your dataset by country and train a predictor on each of them.
When this parameter is selected, Amazon Forecast will train a model with and without this parameter: the best configuration will be kept based on the performance metric of your model.
This supplementary feature includes 2 years of historical weather data and 2 weeks of future weather information for 7 regions covering the whole world, as follows:
To leverage weather data in Amazon Forecast, you need to perform the following actions:
Once you have a dataset with data from a single region (US, Canada, Europe, Central America, Africa & Middle East, or South America), you must add a location feature to your dataset. This feature can be one of the following:
Our household electricity consumption dataset is from London, whose coordinates are 51.5073509 (latitude) and -0.1277583 (longitude). If we wanted to add weather data to our dataset, we could add a column called location to our dataset and set its value to 51.5073509_-0.1277583 for every row (as all our households are located in this city).
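As a quick illustration, here is how you could append that location column to a CSV extract of the target time series before ingestion. This is a hedged sketch: the column names follow the chapter's schema, and the single-city assumption means every row gets the same value:

```python
import csv
import io

# latitude_longitude string, as expected by the Amazon Forecast location attribute
LONDON = "51.5073509_-0.1277583"

def add_location_column(csv_text, location=LONDON):
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames + ["location"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row["location"] = location  # same city for every household
        writer.writerow(row)
    return out.getvalue()
```

For a multi-city dataset, you would instead look up each item's coordinates (or postal code) and fill the column per row.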
Important Tip
As mentioned previously, the Weather index is only available for the past 2 years. If you want to get some practice with this feature, make sure the availability of weather data is compatible with the recording time of your time series data. If your time series data includes data points before July 1, 2018, the weather_index parameter will be disabled in the user interface (UI).
When ingesting a target time series data into a dataset group, you will have the opportunity to select a location type attribute. Follow these next steps:
You can then continue with the ingestion process, as explained in Chapter 3, Creating a Project and Ingesting your Data. If you have other datasets, you can also ingest them in the related time series data and item metadata dataset. You can then train a new predictor based on this dataset group.
Once you have a dataset group with location data, you can enable the Weather index feature while creating a new predictor based on this dataset, as follows:
When you're ready to train a predictor, click on the Create button at the bottom of the predictor creation screen to start training a predictor with weather data.
When this parameter is selected, Amazon Forecast will train a model with and without this parameter: the best configuration for each time series will be kept based on the performance metric of your model. In other words, if supplementing weather data to a given time series does not improve accuracy during backtesting, Amazon Forecast will disable the weather feature for this particular time series.
You now have a good idea of how to take weather data and holiday data into account in your forecasting strategy. In the next section, you are going to see how you can ask Amazon Forecast to preprocess the features you provide in your own dataset.
Amazon Forecast lets you customize the way you can transform the input datasets by filling in missing values. The presence of missing values in raw data is very common and has a deep impact on the quality of your forecasting model. Indeed, each time a value is missing in your target or related time series data, the true observation is not available to assess the real distribution of historical data.
Although there can be multiple reasons why values are missing, the featurization pipeline offered by Amazon Forecast assumes that you are not able to fill in the values based on your domain expertise and that missing values are actually present in the raw data you ingested into the service. For instance, if we plot the energy consumption of the household with the identifier (ID) MAC002200, we can see that some values are missing at the end of the dataset, as shown in the following screenshot:
As we are dealing with the energy consumption of a household, this behavior is probably linked to the occupants moving out of this particular house. Let's see how you can configure the behavior of Amazon Forecast to deal with this type of situation.
Configuring how to deal with missing values happens in the Advanced configuration settings of the Predictor details section when you wish to create a new predictor.
After logging in to the AWS console, navigate to the Amazon Forecast console and click on the View dataset groups button. There, you can click on the name of the dataset group you created in Chapter 3, Creating a Project and Ingesting your Data. You should now be on the dashboard of your dataset group, where you can click on Train predictor. Scroll to the bottom of the Predictor details section and click on Advanced configurations to unfold additional configurations. In this section, you will find the Hyperparameter optimization configuration, the Training parameter configuration (when available for the selected algorithm), and the Featurizations section, as illustrated in the following screenshot:
The featurization is configured with this small JSON snippet, and you will have a similar section for every attribute you want to fill in missing values for.
By default, Amazon Forecast only fills in missing values in the target time series data and does not apply any transformation to the related time series data. There are three parameters you can configure, as follows:
The following screenshot illustrates these different handling strategies Amazon Forecast can leverage for filling missing values:
The global start date is the earliest start date of all the items (illustrated in the preceding screenshot by the top two lines) present in your dataset, while the global end date is the last end date of all the items augmented by the forecast horizon.
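As an illustration of what these filling strategies do to a single series, here is a toy sketch where None marks a missing value. The method names mirror the ones discussed in this section, but the real transformation happens inside the service:

```python
def fill_missing(series, method="zero", fill_value=None):
    # series: observations aligned to the forecast frequency, None where missing
    observed = [v for v in series if v is not None]
    if method == "zero":
        replacement = 0.0
    elif method == "mean":
        replacement = sum(observed) / len(observed)
    elif method == "median":
        ordered = sorted(observed)
        mid = len(ordered) // 2
        replacement = (ordered[mid] if len(ordered) % 2
                       else (ordered[mid - 1] + ordered[mid]) / 2)
    elif method == "value":
        replacement = fill_value  # paired with the *_value parameter
    else:
        raise ValueError(f"unsupported filling method: {method}")
    return [replacement if v is None else v for v in series]
```

For our household example, a "zero" middlefill would treat the gap at the end of MAC002200's history as zero consumption, while "mean" would smooth it toward the household's typical usage; which one is right depends on why the data is missing.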
The following table illustrates the possible values these parameters can take, depending on the dataset:
Let's have a look at these parameters in more detail, as follows:
There are no default values when applying filling methods to related time series (as you can have several series with different expected behavior). Values in bold in the preceding table are the default values applied to the target time series dataset.
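For reference, here is what a filling configuration could look like in the Featurizations widget, based on the Amazon Forecast documentation at the time of writing. The attribute name and chosen methods below are an example, not a prescription:

```json
[
    {
        "AttributeName": "target_value",
        "FeaturizationPipeline": [
            {
                "FeaturizationMethodName": "filling",
                "FeaturizationMethodParameters": {
                    "frontfill": "none",
                    "middlefill": "zero",
                    "backfill": "zero"
                }
            }
        ]
    }
]
```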
When you select a value for one of the filling parameters, you have to define the desired value as a real value with an additional parameter obtained by adding _value to the name of the parameter, as follows:
In this section, you discovered how you can rely on Amazon Forecast to preprocess your features and maintain the consistent data quality necessary to deliver robust and accurate forecasts. In the next section, you are going to dive deeper into how you can put the probabilistic aspect of Amazon Forecast to work to suit your own business needs.
Amazon Forecast generates probabilistic forecasts at different quantiles, giving you prediction intervals rather than mere point forecasts. Prediction quantiles (or intervals) let Amazon Forecast express the uncertainty of each prediction and give you more information to include in the decision-making process that is linked to your forecasting exercise.
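As a quick refresher on what a quantile means, independent of Amazon Forecast: the p90 value of a distribution is the value below which roughly 90% of the observations fall. A tiny plain-Python illustration on made-up data:

```python
from statistics import quantiles

# 100 hypothetical demand observations: 1, 2, ..., 100.
observations = list(range(1, 101))

# statistics.quantiles with n=100 returns the 99 percentile cut points.
cuts = quantiles(observations, n=100)
p10, p50, p90 = cuts[9], cuts[49], cuts[89]

# About 10% of the observations fall below p10, 50% below p50 (the
# median), and 90% below p90.
print(p10, p50, p90)
```

A forecast at the p90 quantile therefore expects the true value to come in below the predicted value about 90% of the time.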
As you have seen earlier in this chapter, Amazon Forecast can leverage different forecasting algorithms: each of these algorithms has a different way to estimate probability distributions. For more details about the theoretical background behind probabilistic forecasting, you can refer to the following papers:
Let's now see how you can configure different forecast types when creating a new predictor.
Configuring the forecast types you want Amazon Forecast to generate happens in the Predictor details section when you create a new predictor. By default, the quantiles selected are 10% (p10), 50% (the median, p50), and 90% (p90). When configuring a predictor, the desired quantiles are called forecast types, and you can choose up to five of them, including any percentile ranging from 0.01 to 0.99 (with an increment of 0.01). You can also select mean as a forecast type.
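Programmatically, the same choice is expressed through the ForecastTypes parameter of the CreatePredictor API, as a list of strings. The following is a hedged sketch of a validator mirroring the constraints described above (it does not call AWS, and mirrors the console rules rather than the service's exact server-side validation):

```python
def is_valid_forecast_type(ft: str) -> bool:
    """Check a forecast type the way the console constrains it:
    either the literal "mean", or a quantile string between
    "0.01" and "0.99" in increments of 0.01."""
    if ft == "mean":
        return True
    try:
        value = round(float(ft), 2)
    except ValueError:
        return False
    # Must round-trip to the same two-decimal string and sit in range.
    return f"{value:.2f}" == ft and 0.01 <= value <= 0.99

# Up to five forecast types can be requested per predictor.
requested = ["0.10", "0.50", "0.99", "mean"]
assert all(is_valid_forecast_type(ft) for ft in requested)
assert len(requested) <= 5
```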
Important Note
CNN-QR cannot generate a mean forecast, as this type of algorithm directly generates predictions for a particular quantile: when selecting mean as a forecast type, CNN-QR will fall back to computing the median (p50).
In the following screenshot, I configured my forecast types to request Amazon Forecast to generate the mean forecast and three quantiles (p10, p50, and p99):
You can click on Remove to remove a forecast type and request up to five forecast types by clicking on the Add new forecast type button.
Important Note
Amazon Forecast bills you for each forecast type generated per time series: each forecast type counts as a billed item, and you will always be billed for a minimum of three quantiles, even if you only intend to use one. It is hence highly recommended to customize the default quantiles to suit your business needs and benefit from the probabilistic capabilities of Amazon Forecast.
Amazon Forecast will compute a quantile loss metric for each requested quantile. In the following screenshot, you can see the default wQL metrics computed for the default quantiles generated by the service: wQL[0.1] is the quantile loss for the p10 forecast type, wQL[0.5] is for the median, and so on:
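If you retrieve these metrics with the AWS SDK (the GetAccuracyMetrics API), the quantile losses come back in a nested structure. The following sketch extracts them from a response-shaped dictionary; the loss values are made up, and you should check the API documentation for the exact structure of the real response:

```python
# Hypothetical response shaped like a GetAccuracyMetrics result;
# the loss values below are made up for illustration.
response = {
    "PredictorEvaluationResults": [{
        "TestWindows": [{
            "Metrics": {
                "WeightedQuantileLosses": [
                    {"Quantile": 0.1, "LossValue": 0.07},
                    {"Quantile": 0.5, "LossValue": 0.11},
                    {"Quantile": 0.9, "LossValue": 0.08},
                ]
            }
        }]
    }]
}

def extract_wql(response: dict) -> dict:
    """Flatten the first test window's quantile losses into a
    {quantile: loss} mapping."""
    metrics = response["PredictorEvaluationResults"][0]["TestWindows"][0]["Metrics"]
    return {m["Quantile"]: m["LossValue"] for m in metrics["WeightedQuantileLosses"]}

print(extract_wql(response))  # {0.1: 0.07, 0.5: 0.11, 0.9: 0.08}
```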
Let's now see how to choose the right quantiles depending on the type of decision you are expecting to make based on these predictions.
Choosing a forecast type to suit your business need is a trade-off between the cost of over-forecasting (incurring higher capital costs or excess inventory) and the cost of under-forecasting (missing revenue because a product is not available when a customer wishes to order it, or because you cannot produce enough finished goods to meet the actual market demand).
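One classical way to formalize this trade-off is the newsvendor critical ratio, a textbook result (not something Amazon Forecast computes for you) that derives the service-level quantile directly from the two costs:

```python
def critical_quantile(cost_under: float, cost_over: float) -> float:
    """Newsvendor critical ratio: the quantile that minimizes expected
    cost when under-forecasting costs `cost_under` per unit and
    over-forecasting costs `cost_over` per unit."""
    return cost_under / (cost_under + cost_over)

# Example: a lost sale costs $9 of margin while a unit of excess stock
# costs $1 to hold, suggesting a forecast at roughly the p90 quantile.
print(critical_quantile(9, 1))  # 0.9
```

When the two costs are equal, the ratio is 0.5, which is one intuition for why the median (p50) is a sensible default point forecast.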
In Chapter 4, Training a Predictor with AutoML, you met the weighted quantile loss (wQL) metric. As a reminder, computing this metric for a given quantile can be done with the following formula:
Here, the following applies:
If you are building a sound vaccination strategy and want to achieve global immunity as fast as possible, you must meet your population's demand at all costs (meaning that doses must be available when someone comes to get their shot). To meet such imperative demand, a p99 forecast may be very useful: this forecast type expects the true value to be lower than the predicted value 99% of the time. If we use τ = 0.99 in the previous formula, we end up with the following:
Let's have a look at another example, as follows:
As you can see, combining different forecast types gives you the ability to adjust your strategy depending on fluctuations over the year.
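The quantile loss underlying these comparisons can be sketched in a few lines of Python. This is a simplified, single-series version for intuition only (Amazon Forecast aggregates across all items and backtest windows), using made-up numbers:

```python
def weighted_quantile_loss(actuals, predictions, tau):
    """Simplified single-series weighted quantile loss: penalizes
    under-prediction with weight tau and over-prediction with weight
    (1 - tau), normalized by the total absolute demand."""
    numerator = sum(
        tau * max(y - q, 0.0) + (1 - tau) * max(q - y, 0.0)
        for y, q in zip(actuals, predictions)
    )
    return 2 * numerator / sum(abs(y) for y in actuals)

# With tau = 0.99, under-predicting is penalized 99 times more heavily
# than over-predicting, pushing the chosen forecast toward high values.
actuals = [100, 100]
low, high = [90, 90], [110, 110]
loss_low = weighted_quantile_loss(actuals, low, 0.99)
loss_high = weighted_quantile_loss(actuals, high, 0.99)
print(loss_low > loss_high)  # True
```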
In this chapter, you discovered the many possibilities Amazon Forecast gives you to customize your predictor training to your datasets. You learned how to choose the best algorithm to fit your problem and how to customize different parameters (quantiles, the missing-value filling strategy, and the usage of supplementary features) to try to improve your forecasting models.
The AutoML capability of Amazon Forecast is a key differentiator when dealing with a new business case or a new dataset. It gives you good directions and reliable results with a fast turnaround. However, achieving higher accuracy to meet your business needs means that you must sometimes be able to override Amazon Forecast decisions by orienting its choice of algorithms, deciding how to process the features of your dataset, or simply requesting a different set of outputs by selecting forecast types that match the way your decision process is run from a business perspective.
In the next chapter, you will use a trained predictor to generate new forecasts (in ML terms, we will use new data to run inferences on a trained model).