Univariate model

For the model that relies primarily on the CPI data series alone, we will use the SAS procedure for Unobserved Components Models, Proc UCM. Although we are calling it a univariate model, we will end up using components of the CPI time series in the role of independent variables. Remember, we aren't using the nine internal variables that are available to us as part of the business problem; those nine variables were used in the multivariate regression model. The components we will be using are the irregular, trend, and seasonal components. We will also use some of the plots that Proc UCM can produce.

The following code fits the univariate model using Proc UCM:

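Proc UCM Data=Model;
   Id Month Interval=Month;                 /* time ID variable, monthly spacing */
   Model CPI;                               /* response series */
   Irregular;                               /* irregular (noise) component */
   Level;                                   /* level part of the trend */
   Slope Var = 0 Noest;                     /* slope variance fixed at 0, not estimated */
   Season Length = 12 Type = Trig;          /* trigonometric seasonal, 12-month cycle */
   Estimate Back = 6 Plot = (loess panel cusum wn);   /* hold out the last 6 months; diagnostic plots */
   Forecast Back = 0 Lead = 24 Print = Forecasts Plot=(forecasts decomp);   /* forecast 24 months past the end of the data */
Run;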
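
The statements above correspond to a structural decomposition of the series. As a sketch, assuming the standard UCM formulation of a local linear trend plus a trigonometric seasonal (the symbols are notation introduced here, not taken from the procedure output), the model can be written as:

```latex
\begin{aligned}
\text{CPI}_t &= \mu_t + \gamma_t + \varepsilon_t, & \varepsilon_t &\sim N(0, \sigma_\varepsilon^2) \\
\mu_t &= \mu_{t-1} + \beta_{t-1} + \eta_t, & \eta_t &\sim N(0, \sigma_\eta^2) \\
\beta_t &= \beta_{t-1} + \xi_t, & \xi_t &\sim N(0, \sigma_\xi^2), \quad \sigma_\xi^2 = 0 \\
\gamma_t &= \sum_{j=1}^{6} \gamma_{j,t}, & \lambda_j &= \frac{2\pi j}{12}
\end{aligned}
```

Here \mu_t is the level, \beta_t the slope, \gamma_t the seasonal made up of six harmonics with frequencies \lambda_j, and \varepsilon_t the irregular component. Because the slope variance is fixed at zero, \beta_t is a constant, which leaves the irregular, level, and seasonal variances as the free parameters.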

The trend has been specified using the Level and Slope statements. Figure 5.25 shows the preliminary estimates of the free parameters:

Figure 5.25: UCM procedure

As seen in Figure 5.26, both the Level and Slope (part of the trend component) are significant components in the model:

Figure 5.26: UCM procedure–final estimates

Let us evaluate the residual diagnostics, shown in Figure 5.27. In the histogram, the residuals seem to be normally distributed. In the Q-Q plot, the residuals lie close to the reference line, which also suggests normality. The ACF and PACF don't exhibit any violation of the whiteness assumption:

Figure 5.27: UCM procedure–residual diagnostics

Figure 5.28 shows the residual white noise test. The first three lags correspond to the three components we have included in the model; as a standard, the white noise test is not reported for as many lags as there are components in the model. While the p-value at the fourth lag is below 0.05, from the fifth lag onward the p-values are greater than 0.05. Hence, no serious violation of whiteness can be observed in the model:

Figure 5.28: UCM procedure–residual white noise test
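
To make the lag-by-lag test more concrete, here is a minimal Python sketch of a Ljung-Box style portmanteau statistic, a common choice for residual white noise tests. It assumes that the three skipped lags reflect three free parameters being subtracted from the degrees of freedom; that reading, the function names, and the sample residuals are illustrative assumptions, not values from Proc UCM. Converting Q to a p-value needs a chi-square distribution function, which is left out here.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given positive lag."""
    n = len(series)
    mean = sum(series) / n
    denominator = sum((v - mean) ** 2 for v in series)
    numerator = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numerator / denominator


def ljung_box(residuals, max_lag, n_free_params):
    """Return (lag, Q, degrees_of_freedom) tuples for lags 1..max_lag.

    degrees_of_freedom is None while lag <= n_free_params, which is the
    assumed reason the test is not reported for the first few lags.
    """
    n = len(residuals)
    running = 0.0
    rows = []
    for lag in range(1, max_lag + 1):
        r = autocorrelation(residuals, lag)
        running += r * r / (n - lag)
        q = n * (n + 2) * running  # cumulative Ljung-Box statistic
        dof = lag - n_free_params if lag > n_free_params else None
        rows.append((lag, q, dof))
    return rows


# Hypothetical residuals, for illustration only.
example = [0.3, -0.1, 0.2, -0.4, 0.1, 0.0, -0.2, 0.3, -0.1, 0.1]
for lag, q, dof in ljung_box(example, max_lag=6, n_free_params=3):
    print(lag, round(q, 3), dof)
```

With three free parameters, the first lag that has positive degrees of freedom is the fourth, which matches the pattern described for Figure 5.28.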

The cumulative residuals in Figure 5.29 point to a structural break. For a period of almost two years, the cumulative residuals stay above the 95% confidence limit:

Figure 5.29: UCM procedure–cumulative residuals
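
The idea behind a cumulative residual plot can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the residuals are scaled by their population standard deviation, and the confidence limits are passed in by the caller, because the exact scaling and limits used by Proc UCM are not reproduced here. The function names and sample values are hypothetical.

```python
from itertools import accumulate
from statistics import pstdev


def cusum(residuals):
    """Cumulative sum of residuals scaled by their standard deviation."""
    scale = pstdev(residuals)
    return list(accumulate(r / scale for r in residuals))


def longest_run_outside(values, upper, lower):
    """Length of the longest consecutive run where values leave the limits."""
    longest = current = 0
    for value, hi, lo in zip(values, upper, lower):
        current = current + 1 if (value > hi or value < lo) else 0
        longest = max(longest, current)
    return longest


# Hypothetical monthly residuals and flat limits, for illustration only.
residuals = [0.5, 0.6, 0.7, 0.4, -0.9, -0.8, -0.5, 0.0]
path = cusum(residuals)
limits_hi = [2.0] * len(path)
limits_lo = [-2.0] * len(path)
print(longest_run_outside(path, limits_hi, limits_lo))
```

A long run outside the limits, such as the roughly two years noted above, is what suggests a structural break rather than a one-off outlier.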

We have tried to solve the business problem using both the multivariate regression and the UCM approaches. Let us compare the forecasts generated for our validation period:

Forecasted Month	Forward Selection	Backward Selection	Maximize R	UCM	Observed Values
Oct 2017	106.4077	106.4077	106.3580	106.46	106.4
Nov 2017	106.4077	106.4077	106.3592	106.48	106.5
Dec 2017	106.3413	106.3413	106.2917	106.48	106.6
Jan 2018	106.2715	106.2715	106.2025	106.46	106.6
Feb 2018	106.2051	106.2051	106.1272	106.46	106.6
Mar 2018	106.2051	106.2051	106.1198	106.476667	106.7

Figure 5.30: Model forecasts versus observed

From the preceding table, we can observe that, in absolute terms, the UCM forecasts are closer to the observed values in five of the six validation months (October 2017 is the exception) and are directionally right more often than the other models. This poses a dilemma for the bank's management: they are keen to leverage the internal data for forecasting, yet the publicly available CPI data seems to produce more accurate forecasts than the models using internal data. Management will have to weigh this when deciding whether to use internal data. If the bank does go ahead with models based on internal data, it seems that only four of the nine variables are needed.
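
The comparison above can be checked with a short Python sketch that uses the values from the forecast table. The mean absolute error and the direction rule (whether the forecast and the observed value move the same way, up, down, or flat, relative to the previous month) are assumptions chosen here for illustration; the chapter does not state which measures were used.

```python
# Values copied from the forecast comparison table (Oct 2017 to Mar 2018).
observed = [106.4, 106.5, 106.6, 106.6, 106.6, 106.7]
forecasts = {
    "Forward Selection": [106.4077, 106.4077, 106.3413, 106.2715, 106.2051, 106.2051],
    "Backward Selection": [106.4077, 106.4077, 106.3413, 106.2715, 106.2051, 106.2051],
    "Maximize R": [106.3580, 106.3592, 106.2917, 106.2025, 106.1272, 106.1198],
    "UCM": [106.46, 106.48, 106.48, 106.46, 106.46, 106.476667],
}


def mean_absolute_error(predicted, actual):
    """Average absolute gap between forecast and observed values."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def sign(x):
    """Return 1, 0, or -1 for positive, zero, or negative x."""
    return (x > 0) - (x < 0)


def direction_hits(predicted, actual):
    """Count months whose month-over-month move matches the observed move."""
    return sum(
        sign(predicted[i] - predicted[i - 1]) == sign(actual[i] - actual[i - 1])
        for i in range(1, len(actual))
    )


mae = {name: mean_absolute_error(v, observed) for name, v in forecasts.items()}
hits = {name: direction_hits(v, observed) for name, v in forecasts.items()}

for name in forecasts:
    print(f"{name}: MAE={mae[name]:.4f}, direction hits={hits[name]} of 5")
```

Under these assumptions, the UCM model has the lowest mean absolute error (about 0.117, against about 0.263 for forward and backward selection and about 0.324 for Maximize R) and matches the observed direction in three of the five month-over-month moves.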
