Autoregressive Integrated Moving Average models

We can achieve similar results with Autoregressive Integrated Moving Average (ARIMA) models. To predict future values of a time-series, we usually have to stationarize it first, which means that the data has a constant mean, variance, and autocorrelation over time. In the past two sections, we used seasonal decomposition and the Holt-Winters filter to achieve this. Now let's see how the generalized version of the Autoregressive Moving Average (ARMA) model can help with this data transformation.
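Before fitting any model, it can be useful to check how far the series is from stationarity. As a minimal sketch (assuming the `nts` time-series object used in the examples below), the forecast package can estimate the number of differences needed:

```r
## Assumes nts is the time-series object analyzed in this chapter
library(forecast)
ndiffs(nts)   # number of non-seasonal differences needed for stationarity
nsdiffs(nts)  # number of seasonal differences needed
```

A result of zero from both calls suggests the series can be modeled without differencing.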

ARIMA(p, d, q) actually combines three models, specified by three non-negative integer parameters:

  • p refers to the order of the autoregressive (AR) part of the model
  • d refers to the degree of differencing in the integrated part
  • q refers to the order of the moving average (MA) part

As ARIMA extends ARMA with an integrated (differencing) part, it can deal with non-stationary time-series as well, as these naturally become stationary after differencing; in other words, when the d parameter is larger than zero.
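The differencing step, and a manual ARIMA fit with explicit p, d, and q values, can be sketched in base R as follows (the `nts` object and the AR(3) order are taken from the examples in this section; the call itself is only an illustration, not the chapter's recommended workflow):

```r
## First-order differencing by hand (this is what d = 1 does internally)
dts <- diff(nts)

## A manual ARIMA(3, 0, 0) fit with the base arima() function:
## AR order 3, no differencing, no moving average terms
fit <- arima(nts, order = c(3, 0, 0))
```

In practice, trying many such (p, d, q) combinations by hand is exactly the tedious process that auto.arima, introduced below, automates.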

Traditionally, choosing the best ARIMA model for a time-series required building multiple models with a variety of parameters and comparing the model fits. On the other hand, the forecast package comes with a very useful function that can select the best fitting ARIMA model for a time-series by running unit root tests, maximizing the log-likelihood, and minimizing the Akaike Information Criterion (AIC) of the models:

> auto.arima(nts)
Series: ts 
ARIMA(3,0,0)(2,0,0)[7] with non-zero mean 

Coefficients:
         ar1      ar2     ar3    sar1    sar2  intercept
      0.3205  -0.1199  0.3098  0.2221  0.1637   621.8188
s.e.  0.0506   0.0538  0.0538  0.0543  0.0540     8.7260

sigma^2 estimated as 2626:  log likelihood=-1955.45
AIC=3924.9   AICc=3925.21   BIC=3952.2

It seems that the AR(3) model with AR(2) seasonal effects has the lowest AIC. But checking the manual of auto.arima reveals that the information criteria used for the model selection were approximated due to the large number (more than 100) of observations. Re-running the algorithm with approximation disabled returns a different model:

> auto.arima(nts, approximation = FALSE)
Series: ts 
ARIMA(0,0,4)(2,0,0)[7] with non-zero mean 

Coefficients:
         ma1      ma2     ma3     ma4    sar1    sar2  intercept
      0.3257  -0.0311  0.2211  0.2364  0.2801  0.1392   621.9295
s.e.  0.0531   0.0531  0.0496  0.0617  0.0534  0.0557     7.9371

sigma^2 estimated as 2632:  log likelihood=-1955.83
AIC=3927.66   AICc=3928.07   BIC=3958.86

Although the preceding seasonal ARIMA model also fits the data well (despite its slightly higher AIC), we might want to build a truly integrated model by specifying the D argument, which forces seasonal differencing, via the following estimates:

> plot(forecast(auto.arima(nts, D = 1, approximation = FALSE), 31))
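The forecast object can also be inspected directly instead of only being plotted. As a sketch (reusing the same `nts` object and model call from the line above):

```r
## Fit the seasonally integrated model and forecast 31 periods ahead
library(forecast)
fit <- auto.arima(nts, D = 1, approximation = FALSE)
fc  <- forecast(fit, h = 31)

head(fc$mean)   # point forecasts
head(fc$lower)  # lower prediction bounds (80% and 95% levels by default)
head(fc$upper)  # upper prediction bounds
```

These components hold the same values that plot(forecast(...)) renders as the forecast line and the shaded prediction intervals.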

Although time-series analysis can sometimes be tricky, and finding the optimal model with the appropriate parameters requires reasonable experience with these statistical methods, the preceding short examples show that even a basic understanding of time-series objects and the related methods can reveal impressive patterns in the data and provide adequate predictions.
