Chapter 6: Models for Univariate Time Series

Introduction

Autocorrelations

Autoregressive Models

Moving Average Models

ARIMA Models

Infinite-Order Representations

Multiplicative Seasonal ARIMA Models

Information Criteria

Use of SAS to Estimate Univariate ARIMA Models

Conclusion

Introduction

This chapter briefly introduces the theory of Autoregressive Integrated Moving Average (ARIMA) models for univariate time series. First, the series has to be differenced if necessary to meet the assumption of stationarity. (For more information, see Chapter 5.) The “I” in ARIMA is for integrated because a series that is transformed into stationarity by differencing is called integrated. Then an Autoregressive Moving Average (ARMA) model is fitted to the stationary series of differences. This chapter, of course, also introduces these ARMA models, which are the starting point for the class of vector ARMA models, known as VARMA models, for multivariate time series.

The Box and Jenkins (1976) method for model fitting and forecasting of time series is based on ARIMA models. The method consists of three steps: (1) identification, (2) estimation, and (3) testing for model fit. The identification step and the test for model fit are both based on the estimated autocorrelation function. Consequently, this chapter also briefly introduces the use of autocorrelations, although in modern, more computer-intensive model fitting, such matters have lost some of their importance.

These models are extended to the multivariate situation in Chapter 8.

Autocorrelations

In Chapter 2, the first-order autocorrelation was used repeatedly to test the fit of a regression model for time series data. This test relies on the model assumption that the errors are white noise; that is, all autocorrelations of the error terms should equal zero. But this assumption often fails for time series data. This use of autocorrelations to test the lack of fit of a regression model can easily leave the student with the impression that autocorrelation always means trouble.

But the concept of autocorrelation is extremely useful in time series modeling. Autocorrelation means that the observed values of the time series are correlated with unknown future values. Knowledge of this dependence forms the basis of many forecasting methods. The serial dependence is used to derive the expectations of future values, conditional on the values that have already been observed. Such conditional expected values are optimal predictions of future observations.

Autocorrelations can also be used as a tool for identifying the order of a model. The estimated autocorrelations are used in the Box and Jenkins (1976) procedure to determine the order of an ARMA model for an observed time series. The idea is to choose a specification of the model whose theoretical autocorrelations have the same form as the observed empirical autocorrelations.

The kth-order autocorrelation is theoretically defined as the correlation, ρk = corr(Xt, Xt−k), between values of the time series separated by a time lag of k periods. For this definition to be meaningful, the series is assumed to be stationary. The correlation then depends only on the time lag k and not on the time index t.

The kth-order autocorrelation is estimated by the following:

r_k = \frac{\frac{1}{T-k}\sum_{t=k+1}^{T}\left(X_t-\bar{X}\right)\left(X_{t-k}-\bar{X}\right)}{\frac{1}{T}\sum_{t=1}^{T}\left(X_t-\bar{X}\right)^{2}}

where X̄ denotes the average of the observed series. This average is often taken as zero if the formula is applied to a series of residuals from a model. The divisions by the numbers of terms in the numerator and the denominator are of only minor importance if the number of observations, T, is much larger than the lag k.

If the series Xt is white noise, then the distribution of this empirical autocorrelation can be approximated by a normal distribution with mean 0 and variance T−1 (that is, the inverse of the number of observations). This means that an empirical autocorrelation numerically larger than 2/√T indicates an autocorrelation worth noticing. If the series consists of residuals from a time series model, the variance for short lags, say k = 1 or 2, is somewhat smaller than T−1 because the model has already fitted the autocorrelations of the original series.
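
For example, for a white noise series of T = 100 observations, the bound is

2/\sqrt{T} = 2/\sqrt{100} = 0.2

so empirical autocorrelations numerically larger than 0.2 deserve attention.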

To test whether all autocorrelations equal zero, the individual autocorrelations are often combined into a portmanteau test statistic, which is basically the sum of squared autocorrelations. An often applied variant is the Ljung-Box (LB) statistic, which is defined as follows:

\mathit{LB} = T(T+2)\sum_{k=1}^{K}\frac{r_k^{2}}{T-k}

This statistic has approximately a chi-squared distribution with degrees of freedom equal to the number of terms in the sum in the LB test, K, minus the number of parameters that are estimated in the time series model.

If the Ljung-Box test is applied to an original observed time series and not to the residuals of a fitted time series model, then the number of estimated parameters is zero, and the degrees of freedom simply equal K. Default values for the number of autocorrelations, K, are 12 for yearly data and 25 for monthly time series.
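
As a small illustration, the following sketch shows how the estimated autocorrelations and a portmanteau check for white noise can be obtained with PROC ARIMA; the data set name sales and the variable name x are hypothetical.

proc arima data=sales;
   /* The IDENTIFY statement prints the estimated autocorrelations up to
      lag 25 together with a chi-square check for white noise based on
      the autocorrelations */
   identify var=x nlag=25;
run;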

Autoregressive Models

The autoregressive model of order p, denoted AR(p), has the following form:

X_t = \mu + \sum_{j=1}^{p}\varphi_j\left(X_{t-j}-\mu\right) + \varepsilon_t

The parameter μ is the mean of the series, but it can be replaced by a linear function of the time index to incorporate a trend in the model. The remainder terms εt form a white noise series of independent, identically distributed variables. Often, these residuals are supposed to be normally distributed.

This model describes in a direct way how past values of the series itself include information about future values of the series. The forecast of a future value i steps ahead of the last observation is calculated as follows:

\hat{X}_{T+i} = \mu + \sum_{j=1}^{p}\varphi_j\left(\hat{X}_{T+i-j}-\mu\right)

in the AR(p) model. The observed values XT+i−j are used in the formula for i ≤ j.
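
For example, in an AR(1) model the forecasts are generated recursively from the last observation XT:

\hat{X}_{T+1} = \mu + \varphi_1\left(X_T-\mu\right), \qquad \hat{X}_{T+2} = \mu + \varphi_1\left(\hat{X}_{T+1}-\mu\right) = \mu + \varphi_1^{2}\left(X_T-\mu\right)

so the forecasts approach the mean μ geometrically as the horizon increases, provided that |φ1| < 1.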

Mathematically, an AR(p) model is stationary if all the roots of the polynomial

1 - \varphi_1 B - \varphi_2 B^{2} - \dots - \varphi_p B^{p}

are outside the unit circle in the complex plane. Here, B denotes the backshift operator defined by BXt = Xt−1. If a root of this polynomial lies on the unit circle, the series has a unit root, which indicates that differencing is necessary to achieve stationarity. In this case, the model reduces to an autoregressive model of order p − 1 for the series of first differences.

Parameter values φj > 0 apply to series in which deviations from the mean value keep the same sign for many consecutive periods. In regression analysis, this behavior is often denoted positive autocorrelation. A simple example is a series of daily sales of ice cream: sales might be higher than expected during periods of hot weather and lower than expected during periods of cold weather. This behavior conflicts with the standard assumption of independence among the remainder terms. It is often a symptom of a statistical model in which some important explanatory variable is missing, such as the outdoor temperature in the ice cream example, as shown in Chapter 2.

Parameter values φj < 0 lead to a series with a jagged behavior because positive deviations from the mean are followed by negative deviations from the mean. In a sales series, a conflict in the labor market might reduce the sales one month but only postpone them to the next month. As a result, total sales over, for example, two months remain constant.

The theoretical autocorrelation function, ρk, for an AR(p) model is mathematically found as the solution to the following difference equation:

\rho_k - \varphi_1\rho_{k-1} - \varphi_2\rho_{k-2} - \dots - \varphi_p\rho_{k-p} = 0

In practice, it is enough to know a few fundamental facts about the theoretical autocorrelation function. First, the autocorrelation function of every stationary autoregressive model tends to zero without ever reaching zero exactly. Second, for p = 1, the autocorrelations decay exponentially as the parameter φ1 raised to the power k:

\rho_k = \varphi_1^{k}

And, finally, for models of order higher than 1 (p > 1), the autocorrelation function can include oscillations around zero.
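
For example, an AR(1) model (p = 1) with φ1 = 0.8 has the theoretical autocorrelations

\rho_1 = 0.8, \quad \rho_2 = 0.64, \quad \rho_3 = 0.512, \quad \dots

which decay toward zero but never vanish exactly.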

Moving Average Models

A moving average model of order q, in short MA(q), is defined by:

X_t = \theta_0 + \varepsilon_t - \theta_1\varepsilon_{t-1} - \dots - \theta_q\varepsilon_{t-q}

where the parameter θ0 is the mean value. The parameters are usually assumed to satisfy the condition that the roots of the polynomial below are outside the unit circle.

1 - \theta_1 B - \theta_2 B^{2} - \dots - \theta_q B^{q}

The remainder terms, the prediction errors εt, form a white noise series. The term εt is the unexpected part of the observation Xt because it is statistically independent of the past values Xt−1, Xt−2, . . . of the series. This description is useful for intuitive interpretations of moving average models because past values of these remainder terms directly affect future values of the time series. If a shock εt happens to the series, the length of its persistence in an MA(q) model is given by the order q.

When an MA(1) model is fitted to the series of first-order differences, the resulting model has a clear interpretation. Consider a situation in which the change from time t − 2 to time t − 1 was unexpectedly high, meaning that Xt−1 − Xt−2 was positive because of a positive error term εt−1. Then you would expect the difference Xt − Xt−1 to be negative, as predicted by the term −θ1εt−1, assuming that θ1 is positive. This is often the situation if the observed series is some kind of activity that must occur but for which the timing is not fixed. An agricultural example is the number of pigs that are slaughtered. This number might vary from month to month. But because every animal has to be slaughtered, a high number one month leaves fewer animals to be slaughtered the next month.

The autocorrelation function of an MA(q) process is easily recognized because all autocorrelations equal zero for lags larger than q:

\rho_k = 0, \quad k > q

The actual formulas of the autocorrelations are unimportant.

ARIMA Models

The combination of Autoregressive and Moving Average (ARMA) models is a generalization that allows for many autocorrelation structures met in practice. The simplest form of these models is the ARMA(1,1) model:

X_t = \theta_0 + \varphi_1 X_{t-1} + \varepsilon_t - \theta_1\varepsilon_{t-1}

The mean value of the series Xt in this notation is:

\mu = \frac{\theta_0}{1-\varphi_1}
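
This expression follows by taking expectations on both sides of the model equation, because the error terms have mean zero:

\mathrm{E}\left(X_t\right) = \theta_0 + \varphi_1\,\mathrm{E}\left(X_{t-1}\right) \quad\Longrightarrow\quad \mu = \theta_0 + \varphi_1\mu \quad\Longrightarrow\quad \mu = \frac{\theta_0}{1-\varphi_1}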

More terms can be added to both the autoregressive and the moving average parts, leading to the general model denoted ARMA(p,q). If differencing is also applied to obtain stationarity, the class of models is denoted ARIMA(p,d,q), where d is the number of differences applied, usually d = 0 or d = 1. For more information, see Box and Jenkins (1976) or Brocklebank and Dickey (2003).

Infinite-Order Representations

All ARMA models can be represented in terms of past forecast errors. These representations have the form of either an infinite-order moving average

X_t = \sum_{i=0}^{\infty}\psi_i\varepsilon_{t-i}, \quad \psi_0 = 1

or an infinite-order autoregression,

\sum_{i=0}^{\infty}\pi_i X_{t-i} = \varepsilon_t, \quad \pi_0 = 1

The series εt is again a series of independent, identically distributed remainder terms with a mean of 0 and a constant variance σ². Often the distribution is assumed to be normal. The formulas assume that the mean value of the series Xt is zero. If the mean value is different from zero, the formulas should be adjusted by subtracting the average from all values of Xt.

The infinite series in these two representations converge if the stated conditions on the roots of the polynomials are met. In practice, the infinite series can only be approximated by a finite number of terms in the summation.

The infinite-order moving average representation tells what happens when a shock (a numerically large value of the residual process εt) occurs. The effect k time periods ahead equals the coefficient ψk. The residual variance σ² is the variance in the conditional distribution of a one-step-ahead forecast. Moreover, the variance of the prediction error for a k-step-ahead forecast is as follows:

\left(1+\psi_1^{2}+\dots+\psi_{k-1}^{2}\right)\sigma^{2}
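
For example, a stationary AR(1) model has the infinite moving average coefficients ψi = φ1 raised to the power i, so the variance of the k-step-ahead prediction error is

\left(1+\varphi_1^{2}+\varphi_1^{4}+\dots+\varphi_1^{2(k-1)}\right)\sigma^{2} = \frac{1-\varphi_1^{2k}}{1-\varphi_1^{2}}\,\sigma^{2}

which increases toward the variance of the series itself, σ²/(1 − φ1²), as the horizon k grows.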

The representation, as an autoregression of infinite order, indicates that you can model every series by means of an autoregressive model just by choosing an order, p, that is sufficiently high. The mixed ARMA(p,q) models, having both p > 0 and q > 0, are, however, often used to reduce the number of parameters of the model. The precise identification of mixed models is a complicated task, one that is nowadays superfluous, as you will see in Chapter 7.

Multiplicative Seasonal ARIMA Models

For time series with a seasonal structure, the orders p or q often have to be as large as or even larger than the season length. Usually, the autoregressive order p in an AR(p) model must be at least 12 or even 24 for monthly data. But the intermediate autoregressive parameters often equal zero, leading to a model with few parameters. To accommodate seasonality, the ARIMA models are often extended by the inclusion of multiplicative seasonal factors. Multiplicative models are not supported by PROC VARMAX, so this subject is not pursued further in this book. For the application of multiplicative seasonal ARIMA models by procedures in SAS, see Brocklebank and Dickey (2003) or Milhøj (2013).

In PROC VARMAX, seasonality is easily modeled by dummy variables, as in the examples in Chapters 2-5; a small sketch is shown below.
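
The following sketch indicates one way to set up such dummy variables for a monthly series. The data set monthly, the SAS date variable date, and the series y are hypothetical, and the orders p = 1 and q = 1 are chosen only for the illustration.

data monthly2;
   set monthly;
   /* Dummy variables for the months January through November;
      December is the reference month */
   array m{11} m1-m11;
   do i = 1 to 11;
      m{i} = (month(date) = i);
   end;
   drop i;
run;

proc varmax data=monthly2;
   /* The dummy variables enter as regressors; an ARMA(1,1) structure
      is assumed for the remaining dynamics */
   model y = m1-m11 / p=1 q=1 method=ml;
run;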

Information Criteria

The choice among models relies on information criteria. These criteria are based on the maximized likelihood value, which should be as large as possible, or correspondingly on the residual sum of squares, which should be as small as possible. To adjust for the number of parameters in the model, a punishment term is added to the criterion because more parameters always improve the basic criterion value.

Often used criteria are the Akaike information criterion (AIC)

\mathit{AIC} = T\log\left(\hat{\sigma}^{2}\right) + 2r

and the Schwarz Bayesian criterion (SBC)

\mathit{SBC} = T\log\left(\hat{\sigma}^{2}\right) + \log(T)\,r

The algorithm fits many models, and you then choose the model with the smallest value according to one of these criteria. The SBC criterion gives the most severe punishment for extra parameters because log(T) > 2 for all series met in practice.

The default method in PROC VARMAX is the corrected Akaike information criterion (AICC). The correction adds a further punishment, depending on the number of parameters r, to the original definition of AIC:

\mathit{AICC} = T\log\left(\hat{\sigma}^{2}\right) + 2r + \frac{2r(r+1)}{T-r-1}
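
As a numerical illustration with T = 100 observations and r = 4 parameters, the punishment terms of the three criteria are

2r = 8, \qquad 2r + \frac{2r(r+1)}{T-r-1} = 8 + \frac{40}{95} \approx 8.42, \qquad \log(T)\,r = 4\log(100) \approx 18.4

so the correction in AICC matters mostly for short series or models with many parameters, whereas SBC gives by far the hardest punishment.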

Even if the model order is chosen by an automatic method and the fit of the model seems good, you need to test the estimated parameters to learn whether some are insignificant. The automatic fitting procedure does not necessarily result in the most economical model in terms of number of parameters.

Use of SAS to Estimate Univariate ARIMA Models

SAS offers many procedures for estimating time series models. PROC AUTOREG can be used to estimate some simple models, as demonstrated in Chapters 2-5. The main SAS procedure for fitting ARIMA models to univariate time series is PROC ARIMA, which includes everything necessary to perform the calculations for the methods presented in the famous Box and Jenkins (1976) book, together with more recent additions. For a thorough treatment of all facilities offered by PROC ARIMA, see Brocklebank and Dickey (2003).
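
The following sketch shows the three Box and Jenkins steps in PROC ARIMA for a hypothetical data set series with the variable x; the single differencing and the orders p = 1 and q = 1 are assumptions made only for the illustration.

proc arima data=series;
   /* Identification: autocorrelations of the first differences */
   identify var=x(1) nlag=25;
   /* Estimation: an ARMA(1,1) model for the differenced series,
      that is, an ARIMA(1,1,1) model for the original series */
   estimate p=1 q=1 method=ml;
   /* Forecasting 12 periods ahead */
   forecast lead=12;
run;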

PROC VARMAX is the main procedure used in this book. It includes methods for automatically choosing models within the ARMA family: all models with orders p and q up to, for example, five are fitted, and the best fit is chosen by some criterion. In this way, the identification phase of the Box and Jenkins method is avoided, and the user does not have to recognize the specific autocorrelation functions that correspond to various model orders. This feature is, of course, possible only with the use of modern computers to quickly estimate the parameters in many candidate models. Such a trial-and-error method was too time-consuming in the 1970s, when Box and Jenkins wrote their book.
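
A minimal sketch of this automatic comparison is shown below, again for a hypothetical data set series with the variable x; it assumes that the MINIC option of the MODEL statement is used to compare all combinations of orders up to p = 5 and q = 5 by the AICC criterion.

proc varmax data=series;
   /* The MINIC option computes the information criterion for all
      combinations of autoregressive and moving average orders up to 5,
      so that the pair of orders with the smallest value can be chosen */
   model x / minic=(type=aicc p=5 q=5);
run;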

Conclusion

The class of ARIMA models was invented to model autocorrelation in time series. The model fit is tested by checking that the residual autocorrelations are close to zero. If the residual autocorrelations are significantly different from zero, the model fit is inadequate. You perform this test both by running the Ljung-Box test and by looking at the individual estimated residual autocorrelations.

The output from PROC AUTOREG, PROC ARIMA, and PROC VARMAX provides the necessary test statistics, with the corresponding p-values and graphs of the estimated residual autocorrelations. The model testing is easy. Other aspects of the model to be checked might include an assumption of normality and the assumption of a constant residual variance. To check these assumptions, you can review the output for plots of the residual process, histograms, and normal probability plots for the residuals. Moreover, some numerical tests are produced.
