Stationary time series models

In this section, we will describe a few stationary time series models. As we will see, these can be used to model a number of real-world processes.

Moving average models

A moving average (MA) process is a stochastic process in which the random variable at time step t is a linear combination of the most recent terms of a white noise process. Concretely, we can write this as follows:

$$Y_t = e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2} + \cdots + \theta_q e_{t-q}$$

In the previous equation, and henceforth, we will assume that the e terms are white noise random variables with mean 0 and variance $\sigma_w^2$. We can describe a moving average process in an equivalent way by making use of the backshift operator, B. This operator, when applied to the random variable at time t in a stochastic process, produces the random variable at the previous time step, t - 1. For example:

$$B Y_t = Y_{t-1}$$

We can obtain random variables further back in time by successive applications of the backshift operator. $B^2$, for example, indicates the application of the backshift operator twice, so that we go back two time steps. With that in mind, we can express an MA process using the backshift operator as follows:

$$Y_t = (1 + \theta_1 B + \theta_2 B^2 + \cdots + \theta_q B^q)\, e_t$$

The expression in parentheses is a polynomial of order q in terms of the backshift operator B. This polynomial is referred to as the characteristic polynomial of an MA process.

$$\theta(x) = 1 + \theta_1 x + \theta_2 x^2 + \cdots + \theta_q x^q$$

An MA process is always stationary, regardless of the choice of the θ coefficients or the order q of the process. We are, however, interested in the roots of the equation θ(x) = 0. If the roots of this equation all exceed 1 in absolute value, we say that the MA process is invertible. Invertibility is a highly desirable property of MA processes: invertible MA processes have unique ACF plots, whereas non-invertible MA processes can share the same ACF plot.

For example, an MA(1) process with a single coefficient θ has the same ACF plot as the MA(1) process whose coefficient is 1/θ. The reader can verify this in a short while, when we present the equation for the autocorrelation function. We will now explore some of the statistical properties of the MA process. Remembering that every e term is a white noise random variable, we can see that the mean of an MA process is constant at 0, and that its variance is also constant, given by the following expression, where we define $\theta_0$ to be 1:

$$\operatorname{Var}(Y_t) = \sigma_w^2 \sum_{i=0}^{q} \theta_i^2$$

The autocorrelation function is given by the following:

$$\rho(k) = \begin{cases} 1, & k = 0 \\[4pt] \dfrac{\sum_{i=0}^{q-k} \theta_i \theta_{i+k}}{\sum_{i=0}^{q} \theta_i^2}, & 1 \le k \le q \\[4pt] 0, & k > q \end{cases}$$

Deriving the autocorrelation function is a little more tedious, so we leave it as an exercise for the interested reader. The important observation to make from this equation is that the ACF of an MA process has nonzero values for lags up to and including q, the order of the process, and is 0 for all larger lags. This fact is useful not only for identifying an MA process through its ACF, but also for estimating the order of the process, as we can take the largest lag that has a statistically significant nonzero value.
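As promised, we can also use this equation to verify the earlier claim about the MA(1) process. At lag 1, the autocorrelation is:

$$\rho(1) = \frac{\theta_1}{1 + \theta_1^2}$$

Substituting $1/\theta_1$ for $\theta_1$ gives $\frac{1/\theta_1}{1 + 1/\theta_1^2} = \frac{\theta_1}{\theta_1^2 + 1}$, exactly the same value, so the two processes produce identical ACF plots. We will now simulate some MA processes to get an idea of this.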

Note

For MA processes, and indeed all the time series models discussed in this chapter, there are a number of very useful references with examples in R. An excellent text from Springer's Use R! series is Introductory Time Series with R by Cowpertwait and Metcalfe. We also recommend Time Series Analysis: With Applications in R by Cryer and Chan, also published by Springer.

In R, we can simulate MA processes (among others) using the arima.sim() function. This is actually a general function that we will use often in this chapter. To generate an MA process, we use the n parameter to specify the length of the series and the model parameter to describe the series we want to simulate. The latter is a list in which we set the vector of θ coefficients in the ma attribute; the standard deviation of the white noise terms is passed separately, through the sd argument of arima.sim() itself. The following is how we can generate 1,000 samples from the MA process $e_t + 0.84 e_{t-1} + 0.62 e_{t-2}$, where the white noise terms have a standard deviation of 1.2:

> set.seed(2357977)
> ma_ts1 <- arima.sim(model = list(ma = c(0.84, 0.62)),
                      n = 1000, sd = 1.2)
> head(ma_ts1, n = 8)
[1] -2.403431159 -2.751889402 -2.174711499 -1.354482419
[5] -0.814139443  0.009842499 -0.632004838 -0.035627181

The arima.sim() function returns the result of the simulation in a special time series object that R calls ts. This object is useful for keeping track of some basic information about a time series and supports specialized plots that are useful in time series analysis.
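For instance, we can quickly inspect the object with the base R functions class() and frequency(); a series produced by arima.sim() defaults to a frequency of 1:

> class(ma_ts1)
[1] "ts"
> frequency(ma_ts1)
[1] 1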

According to what we learned in this section, the ACF plot of this simulated time series should display two significant peaks at lags 1 and 2. We will now plot this to confirm our expectations. We will also plot the ACF of another simulated MA(2) process, with coefficients 0.84 and -0.62, to show how we can also obtain negative values in the ACF coefficients:
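A minimal way to do this is with the acf() function; the seed and the name ma_ts2 for the second series are our own choices for this sketch:

> set.seed(2357978)
> ma_ts2 <- arima.sim(model = list(ma = c(0.84, -0.62)),
                      n = 1000, sd = 1.2)
> acf(ma_ts1)
> acf(ma_ts2)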

[Figure: ACF plots of the two simulated MA(2) processes]

Autoregressive models

Autoregressive (AR) models come about from the notion that we would like a simple model to explain the current value of a time series in terms of a limited window of the most recent values from its past. The equation for an AR model of order p is:

$$Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \cdots + \phi_p Y_{t-p} + e_t$$

Collecting all the Y terms on the left and applying the backshift operator again yields:

$$(1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p)\, Y_t = e_t$$

Unlike the MA process, the AR process is not always stationary. The condition for stationarity relies on the characteristic equation for the AR process, which is given by:

$$\phi(x) = 1 - \phi_1 x - \phi_2 x^2 - \cdots - \phi_p x^p = 0$$

If the roots of this equation all have magnitude greater than 1, the process is stationary. The following two relations are necessary (but not sufficient) conditions for this to happen:

$$\sum_{i=1}^{p} \phi_i < 1 \qquad \text{and} \qquad |\phi_p| < 1$$

These conditions often help us spot a non-stationary AR process through a quick examination of its coefficients. Note that a random walk is just an AR(1) process with the first coefficient, φ1, equal to 1.
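We can also check stationarity directly with base R's polyroot() function, which returns the roots of a polynomial given its coefficients in increasing order of power. A minimal sketch, using the coefficient φ1 = 0.74 of the AR(1) process we simulate shortly:

> # Roots of 1 - 0.74x; all moduli must exceed 1 for stationarity
> Mod(polyroot(c(1, -0.74)))
[1] 1.351351
> # For a random walk (phi1 = 1), the root lies exactly on the unit circle
> Mod(polyroot(c(1, -1)))
[1] 1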

The autocorrelation function of an AR process has a more complex form than that of the MA process. In particular, it does not have a cut-off property like the MA process, where all lags greater than q, the order of the MA process, have an autocorrelation of zero. For example, the autocorrelation function for a stationary AR(1) process is given by:

$$\rho(k) = \phi_1^{k}, \quad k \ge 0$$

We will simulate an AR(1) process with the arima.sim() function again. This time, the coefficients of our process are defined in the ar attribute inside the model list:

> set.seed(634090)
> ar_ts1 <- arima.sim(model = list(ar = c(0.74)), n = 1000, sd = 1.2)

The following ACF plot shows the exponential decay in lag coefficients, which is what we expect for our simulated AR(1) process:
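It can be produced by calling acf() on the simulated series:

> acf(ar_ts1)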

[Figure: ACF plot of the simulated AR(1) process, showing exponential decay in the lag coefficients]

Note that with the moving average process, the ACF plot was useful in helping us identify the order of the process. With an autoregressive process, this is not the case. To deal with this, we use the partial autocorrelation function (PACF) plot instead. We define the partial autocorrelation at time lag k as the correlation that results when we remove the effect of any correlations that are present at lags smaller than k. By definition, an AR process of order p depends only on the values of the process up to p units of time in the past.

Consequently, the PACF plot will exhibit zero values for all lags greater than p, creating a parallel with the ACF plot for an MA process: we can simply read off the order of an AR process as the largest time lag whose PACF term is statistically significant. In R, we can generate the PACF plot with the pacf() function. For our AR(1) process, this produces a plot with only one significant lag term, as expected:
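Generating it takes a single call on our simulated series:

> pacf(ar_ts1)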

[Figure: PACF plot of the simulated AR(1) process, showing a single significant lag term]

Autoregressive moving average models

We can combine the moving average and autoregressive models into a single model that has elements of both a moving average process and an autoregressive process. We call this generalized model an autoregressive moving average (ARMA) model. The general equation for an ARMA(p, q) process (that is, an ARMA process with a pth order autoregressive component and a qth order moving average component) is given by:

$$Y_t = \phi_1 Y_{t-1} + \cdots + \phi_p Y_{t-p} + e_t + \theta_1 e_{t-1} + \cdots + \theta_q e_{t-q}$$

Note that a purely moving average process MA(q) can be written as the ARMA process ARMA(0, q) and that a purely autoregressive process AR(p) can be written as the ARMA process ARMA(p, 0). An ARMA process is stationary if the characteristic equation of the AR component φ(x) = 0 has roots whose magnitude is greater than unity. This is exactly the same condition as with a purely autoregressive model.

In a similar vein, the process is invertible if the characteristic equation of the MA component θ(x) = 0 has roots whose magnitude is greater than unity. To uniquely determine an ARMA process, we further require that there be no common factors in the characteristic equations of the MA and AR components because they can cancel out, allowing us to obtain an equivalent but lower order ARMA process.
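To see why, consider an ARMA(1, 1) process whose AR and MA characteristic polynomials share the factor $(1 - \phi B)$:

$$(1 - \phi B)\, Y_t = (1 - \phi B)\, e_t$$

Cancelling the common factor on both sides leaves $Y_t = e_t$, which is simply white noise, that is, an equivalent ARMA(0, 0) process.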

There are various techniques to fit ARMA models, and they usually consist of two parts. The job of the first part is to identify the order of the ARMA process by finding p and q. Once these have been chosen, the second part estimates values for the coefficients of the AR and MA components, for example, by minimizing the sum of squared errors between the observed sequence and the estimated sequence. In R, there are methods that perform this optimization for us. Later in this chapter, when we investigate real-world data sets, we will explore a method for choosing between trained models of different orders that relies on minimizing the AIC value.
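As a preview of the second part, base R's arima() function estimates the coefficients once an order has been chosen. The following is a minimal sketch; the seed, series name, and coefficients are arbitrary choices for illustration:

> set.seed(12345)
> arma_ts1 <- arima.sim(model = list(ar = 0.5, ma = 0.4), n = 1000, sd = 1.2)
> # order = c(p, d, q); d = 0 means no differencing, giving a plain ARMA fit
> arima(arma_ts1, order = c(1, 0, 1))

The printed fit reports the estimated AR and MA coefficients with their standard errors, as well as the model's log likelihood and AIC.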
