CHAPTER 9

Advanced Time-Series Techniques

Having learned the simple random walk model, Taila wonders what advanced issues arise with nonstationary time-series models. Prof. Metric welcomes her curiosity, because these problems are the subject of this chapter. Once we finish it, we will be able to:

1. Analyze cases when nonstationary models can be made stationary.

2. Explain the cointegration concepts and the theoretical foundations of the related tests.

3. Discuss other time-series models.

4. Apply Excel to perform the necessary tests and estimate the models.

Prof. Metric says that we will learn how to detect a nonstationary model and how to handle a nonstationary model in the next section.

Dealing with Nonstationarity

In Chapters 5 and 6, we learned that if a model follows a random walk, we can estimate the first difference of the model to eliminate the nonstationary problem. Prof. Metric tells us that we can extend this technique to the case of the Autoregressive Distributed Lag (ARDL) model as well.

Nonstationary ARDL Model

In this case, if the dependent and explanatory variables on both sides of a model follow random walk processes, each denoted as an I(1) process in Chapter 5, then we can take first differences on both sides so that each series becomes stationary, called an I(0) process in Chapter 5, and perform an Ordinary Least Squares (OLS) estimation on the differenced model. The general equation is written as:

Δyt = b1Δyt−1 + b2Δxt + b3Δxt−1 + et (9.1)

We notice that Equation (9.1) contains the lagged first differences Δyt−1 and Δxt−1 in addition to Δyt and Δxt because the presence of random walk processes on both sides complicates the regression results. The additional first differences help guarantee that the series become stationary. The model is still considered first-difference stationary, and a constant term can be added without affecting the consistency of the estimator.

We then work on an example. Taila offers the results from an estimation she performed for her company’s profit as follows:

PROFITt = PROFITt−1 + 0.11 PROFITt−2 + 0.6 ΔADSt−1

where PROFIT equals the company’s monthly revenue minus its monthly costs, and ADS is its expenditure on advertisements. She took the first difference of PROFIT to make ΔPROFITt an I(0) process, obtaining the following equation:

ΔPROFITt = PROFITt − PROFITt−1 = 0.11 PROFITt−2 + 0.6 ΔADSt−1

The known values are PROFITt−1 = $6,000; PROFITt−2 = $8,000; and ΔADSt−1 = $3,000.

We are able to calculate the change in predicted value for her company as follows:

ΔPROFITt = 0.11 PROFITt−2 + 0.6 ΔADSt−1 = 0.11*8,000 + 0.6*3,000 = 880 + 1,800 = 2,680 ($)

Hence, the predicted profit for her company is:

PROFITt = ΔPROFITt + PROFITt−1 = 2,680 + 6,000 = 8,680 ($)
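For readers who like to check the arithmetic in code, here is a minimal Python sketch of the same one-step-ahead forecast; the variable names are illustrative, and the coefficients and values are the ones from Taila's example.

# One-step-ahead forecast from Taila's first-difference equation (values from the example above).
profit_lag1 = 6000.0      # PROFIT(t-1), in dollars
profit_lag2 = 8000.0      # PROFIT(t-2)
d_ads_lag1 = 3000.0       # change in advertising, ΔADS(t-1)

d_profit = 0.11 * profit_lag2 + 0.6 * d_ads_lag1    # predicted change: 2,680
profit_forecast = profit_lag1 + d_profit            # predicted level: 8,680
print(d_profit, profit_forecast)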

Dr. Theo then reminds us that the first-differenced model can be extended to panel data. First differencing is most suitable for panel data when the error term follows a random walk, as analyzed in Wooldridge (2013).

Invo raises his hand and asks,

Equation (9.1) is already very cool because it helps us overcome the problem of nonstationary variables in an ARDL model. However, the equation only predicts the change in the dependent variable based on the changes in the explanatory variables. Since the ARDL model is complicated, it is not easy to recover the relationship among the variables in their original forms rather than in their changes. It would be even cooler if I could know how y depends on x, instead of just the relationship between their differences. How can I find this out?

Prof. Metric praises him for a good question and says that if x and y share a common trend, then we can perform regressions of y on x without worrying about spurious results. There are two cases when x and y both share a common trend. The first is that x and y are both trend stationary, and the second case is that x and y are cointegrated. In both cases, OLS estimators will be consistent (Wooldridge 2013).

Trend Stationarity

Given the original model:

yt = a1 + a2xt + et (9.2)

where both x and y are nonstationary.

Suppose x and y can become stationary after being adjusted for a trend:

yt = b1 + b2t + ut
xt = c1 + c2t + vt

where ut and vt are random variables with zero means and constant variances:

var(vt) = σ²v, and cov(vt, vz) = 0 for t ≠ z

var(ut) = σ²u, and cov(ut, uz) = 0 for t ≠ z

In this case, we can estimate the following model:

y*t = a2x*t + e*t (9.3)

where y*t = yt − b1 − b2t and x*t = xt − c1 − c2t.

The series y* and x* are the detrended data, and the OLS estimator of the model in Equation (9.3) will be consistent.
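To make the detrending idea concrete, here is a minimal Python sketch using simulated trend-stationary data. The sample size, coefficients, and the use of the statsmodels package are illustrative assumptions, not part of the chapter's Excel workflow.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
T = 200
t = np.arange(1, T + 1)

# Simulated trend-stationary series (illustrative coefficients).
x = 2.0 + 0.5 * t + rng.normal(scale=1.0, size=T)
y = 1.0 + 0.3 * t + 0.8 * (x - 2.0 - 0.5 * t) + rng.normal(scale=1.0, size=T)

# Detrend each series by regressing it on a constant and a time trend and
# keeping the residuals; these play the role of y* and x* in Equation (9.3).
y_star = sm.OLS(y, sm.add_constant(t)).fit().resid
x_star = sm.OLS(x, sm.add_constant(t)).fit().resid

# OLS on the detrended data gives a consistent estimate of a2.
print(sm.OLS(y_star, x_star).fit().params)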

Cointegration

Given the original model in Equation (9.2):

yt = a1 + a2xt + et

where yt and xt both follow random walk processes. If yt and xt are cointegrated, then the OLS estimators for Equation (9.2) will be consistent for a1 and a2. The condition for cointegration is that the error term is an I(0), that is, a stationary process as discussed in Chapter 5. Hence, if both x and y in Equation (9.2) follow I(1) processes, we only need to find out whether the error term follows an I(0) process. From Chapter 2, we know that the error term is written as:

et = yt − a1 − a2xt (9.4)

If this error term et follows an I(0) process, performing an OLS will yield consistent estimators.

With this remark, Prof. Metric leads us to the next subsection on testing for stationarity and cointegration.

Unit-Root Tests

We learn that a unit-root test can be performed to detect nonstationarity and cointegration.

Detecting Nonstationarity

This test is called a “unit-root test” because, in Equation (9.5) below, the series is nonstationary if |a| = 1 (a unit root) and stationary if |a| < 1. The test for stationarity is also called a Dickey–Fuller test. The original equation from which the test is derived is written as:

yt = ayt−1 + vt (9.5)

Subtracting yt−1 from both sides yields:

Δyt = (a − 1)yt−1 + vt = byt−1 + vt (9.6)

When a = 1, that is, when the series has a unit root, b = a − 1 = 0. Equation (9.6) is therefore very convenient, because the t-statistics reported by all quantitative packages are for tests of significance with the null hypothesis b = 0. The four steps for a Dickey–Fuller test are similar to those in Chapter 3 for a t-test, with the hypotheses written as follows:

  (i) H0: b = 0; Ha: b < 0

To perform the Dickey–Fuller test, the model in Equation (9.6) is often extended to allow for a constant term and a trend. The model with the constant term is:

Δyt = a0 + byt−1 + vt (9.7)

The model that adds the trend is:

Δyt = a0 + byt−1 + λt + vt (9.8)

Prof. Metric says that these three models are usually estimated concurrently so that the most appropriate model is selected based on whether the constant term or the trend is significant.

 (ii) Tau-statistics (τ-statistics): When y is nonstationary, the variance of y is inflated. Hence, the statistic follows a τ (tau) distribution instead of a t-distribution and is called a τ-statistic. However, the formula is the same as that for the t-statistic:

τ = b̂/se(b̂)

(iii) The τ-critical values: Table 9.1 displays the most important critical values for Dickey–Fuller tests. The complete table is listed in Fuller (1976).

(iv) Decision: If |τ-statistic| > |τ-critical value|, we reject H0, meaning b < 0 and implying that the model is stationary.

Prof. Metric reminds us that the meaning and implication of our decision in this test is the complete opposite of that in the Lagrange Multiplier (LM) tests for heteroscedasticity and autocorrelation. In an LM test, we will be miserable if the null hypothesis is rejected because we then know that the model has either a heteroscedasticity or an autocorrelation problem. In a Dickey–Fuller test, we will jump with joy if the null hypothesis is rejected because we then know that the model is stationary and so no correction is needed. Note that all unit-root tests have low power, and the power is even lower if the model misses a structural break, for example a change in the direction of the slope, or suffers from other serious forms of misspecification.

We can extend the aforementioned models to allow for more lags while still being able to use the critical values listed in Table 9.1.

In this case, the test is called the augmented Dickey–Fuller test, which also has the advantage of eliminating autocorrelation thanks to the additional differences of the lagged dependent variable. Each of the three equations can be augmented in this way; the augmented version of Equation (9.6) is written below, after Table 9.1, as:

Table 9.1 Main excerpts of critical values for the Dickey–Fuller τ distribution

Significance level      0.01     0.025    0.05     0.10
For model in (9.6)      –2.58    –2.23    –1.95    –1.62
For model in (9.7)      –3.43    –3.12    –2.86    –2.57
For model in (9.8)      –3.96    –3.66    –3.41    –3.12

Source: Reformatted from Fuller (1976).

Δyt = byt−1 + a1Δyt−1 + a2Δyt−2 + … + amΔyt−m + vt (9.9)

The augmented version of Equation (9.7), with the constant added, is written as:

Δyt = a0 + byt−1 + a1Δyt−1 + a2Δyt−2 + … + amΔyt−m + vt (9.10)

The model that adds the trend is written as:

Δyt = a0 + byt−1 + λt + a1Δyt−1 + a2Δyt−2 + … + amΔyt−m + vt (9.11)

Testing processes for these augmented models can be performed in similar manners. Prof. Metric says that we will have the opportunity to perform the Dickey–Fuller test later with Prof. Empirie in the Data Analysis section.
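For readers who want to run these tests outside Excel, the sketch below uses the adfuller function from the statsmodels package (an assumption about the tools available, and it requires a recent statsmodels version for the regression codes shown); the simulated random walk simply stands in for a real series.

import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(1)
y = np.cumsum(rng.normal(size=300))   # a simulated random walk, so y is I(1)

# Augmented Dickey-Fuller tests for the three specifications:
# regression='n' has no constant (9.9), 'c' adds a constant (9.10),
# and 'ct' adds a constant and a trend (9.11).
for spec in ("n", "c", "ct"):
    tau, pvalue, *_ = adfuller(y, regression=spec, autolag="AIC")
    print(spec, round(tau, 3), round(pvalue, 3))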

Testing Cointegration

A unit-root test is also used to test for cointegration in the models in Equations (9.2) and (9.3). We want to find out whether the error term is an I(0) process, where the estimated residuals are:

êt = yt − â1 − â2xt (9.12)

The cointegration test requires that we estimate an equation equivalent to Equation (9.6) for the residuals:

Δêt = γêt−1 + vt (9.13)

For the augmented cointegration test, the equation is written as:

Δêt = γêt−1 + a1Δêt−1 + a2Δêt−2 + … + amΔêt−m + vt (9.14)

Table 9.2 Main excerpts of critical values for cointegration tests

Number of variables (including y)    Significance level
                                     0.01     0.05     0.10
y and one x                          –3.90    –3.34    –3.04
y and two x’s                        –4.29    –3.74    –3.45
y and three x’s                      –4.64    –4.10    –3.81
y and four x’s                       –4.96    –4.42    –4.13

Source: Reformatted from Davidson and MacKinnon (2004).

There is a complication in a cointegration test. Since we have to estimate the original model in Equation (9.2) using an OLS procedure to obtain the residuals êt, and OLS, being famously designed as the ordinary “least squares,” always chooses as small a sample variance as possible, the OLS procedure will make the residuals look as stationary as possible. This could cause us to jump for joy and reject the null of nonstationarity even though the errors might in reality be nonstationary.

To overcome this problem, great econometricians have developed new tables for us with critical values suitable for this test. Every time an explanatory variable is added to the model, the critical value shifts further to the left. An adapted version of these tables with the most important critical values is displayed in Table 9.2.

We are relieved that the test is performed in four standard steps, as those for the Dickey–Fuller test, and the interpretation is similar, that is, rejecting the null implies that x and y are cointegrated and using OLS to estimate the model in Equation (9.2) will yield consistent estimators.
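As an illustration of this two-step idea, the sketch below simulates a cointegrated pair in Python and applies the Engle–Granger test from statsmodels, which compares the residual-based τ-statistic with critical values of the kind shown in Table 9.2. The data, coefficients, and package call are assumptions for demonstration only.

import numpy as np
from statsmodels.tsa.stattools import coint

rng = np.random.default_rng(2)
x = np.cumsum(rng.normal(size=300))          # x is a random walk, so I(1)
y = 1.0 + 0.5 * x + rng.normal(size=300)     # y shares x's stochastic trend

# Engle-Granger two-step test: regress y on x, then run a unit-root test
# on the residuals and compare with cointegration critical values.
tau, pvalue, crit = coint(y, x)
print(round(tau, 3), round(pvalue, 3), crit)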

Other Models

Vector Autoregressive (VAR) Model

Prof. Metric reminds us that we were introduced to simultaneous equation estimation in Chapter 7 and says that we can extend this concept to the case of nonstationary models now. Suppose that we have a system of two equations as follows

yt = b10 + b11yt−1 + b12xt−1 + e1t
xt = b20 + b21yt−1 + b22xt−1 + e2t (9.15)

Since the first equation involves lagged values of yt and the second equation involves lagged values of xt, each equation is similar to an AR model. The involvement of an explanatory variable, other than the lagged dependent variable, makes each of them look similar to an ARDL model. However, we are considering the system as a whole, hence the name VAR model, where the word “vector” stands for more than one series and equation.

If x and y follow I(0) processes, System (9.15) can be estimated using OLS applied to each equation. Nonetheless, if x and y are nonstationary variables that follow I(1) processes and are not cointegrated, then we can take the first differences, and the VAR model is written as:

Δyt = b11Δyt−1 + b12Δxt−1 + e1t
Δxt = b21Δyt−1 + b22Δxt−1 + e2t (9.16)

This first difference form transforms all I(1) variables to I(0) variables, and System (9.16) can be estimated using an OLS procedure. Although System (9.15) can be theoretically called a VAR model, it is common practice among econometricians to refer to System (9.16) as a VAR model. Hence, all econometricians will think that we are estimating System (9.16) if we mention that we will use a VAR model.
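A minimal sketch of estimating System (9.16) in Python follows, assuming the statsmodels VAR class and two simulated, non-cointegrated I(1) series; in practice each differenced equation could equally be estimated by OLS on its own.

import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(3)
x = np.cumsum(rng.normal(size=300))   # an I(1) series
y = np.cumsum(rng.normal(size=300))   # another I(1) series, not cointegrated with x

# Difference both series so each becomes I(0), then fit a VAR(1) on the
# differences, which is what System (9.16) amounts to.
data = pd.DataFrame({"dy": np.diff(y), "dx": np.diff(x)})
var_fit = VAR(data).fit(1)
print(var_fit.params)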

Vector Error Correction (VEC) Model

Prof. Metric says that System (9.15) might have x and y follow I(1) processes, but if they are cointegrated, then we need to modify this system to exploit the cointegrating relationship. This modified version is called a VEC model. There are two advantages of using this VEC model:

 (i) We can retain the variables in levels, instead of only in differences, because the cointegrating combination is stationary.

(ii) We can use the best technique available in estimating the model.

Theoretically, we can estimate System (9.15) using OLS because we can modify it by introducing a third equation relating yt and xt:

yt = a1xt + et (9.17)

Going backward one period yields:

yt−1 = a1xt−1 + et−1 (9.18)

Since x and y are cointegrated, et and et−1 are I(0) variables and we can solve for et−1 from Equation (9.18):

et−1 = yt−1 − a1xt−1 (9.19)

Since the left-hand side of Equation (9.19) is stationary, its right-hand side is stationary as well, and the VEC model is written as:

Δyt = a10 + a11(yt−1 − a1xt−1) + v1t
Δxt = a20 + a21(yt−1 − a1xt−1) + v2t (9.20)

We can expand Equation (9.20) as follows:

yt − yt−1 = a10 + a11yt−1 − a11a1xt−1 + v1t
xt − xt−1 = a20 + a21yt−1 − a21a1xt−1 + v2t

Gathering the terms on both sides of the system yields:

yt = a10 + (1 + a11)yt−1 − a11a1xt−1 + v1t
xt = a20 + a21yt−1 + (1 − a21a1)xt−1 + v2t

Hence, we can use OLS to estimate the following system:

yt = d10 + d11yt−1 + d12xt−1 + v1t
xt = d20 + d21yt−1 + d22xt−1 + v2t (9.21)

where

d10 = a10; d11 = 1 + a11; d12 = −a11a1; d20 = a20; d21 = a21; and d22 = 1 − a21a1.

Invo exclaims, “Oh! Now we can estimate the model in levels instead of in differences.” Prof. Metric praises him for the good observation and enthusiasm and says that the estimation and interpretation of each equation in System (9.21) is straightforward, as in any OLS model. He then tells us that if our purpose is to estimate Equation (9.17) and the two series are cointegrated, we can use a different VEC system by working directly with the error term on the left-hand side of Equation (9.19). In this case the model is written as:

Δyt = a10 + a11et−1 + v1t
Δxt = a20 + a21et−1 + v2t (9.22)

Taila then asks the class if any of us know how to obtain et−1 in System (9.22). Booka says, “I think we can generate it by writing êt−1 = yt−1 − â1xt−1.” Prof. Metric cheerfully says that her suggestion is great, and the model is written as:

Δyt = a10 + a11êt−1 + v1t
Δxt = a20 + a21êt−1 + v2t (9.23)

The interpretation of a11 and a21 in System (9.23) is different from that in System (9.21). In this case, they reveal how each change in the dependent variables reacts to the cointegrating error. For example, if y is consumption and a11 = 0.02 with a p-value of 0.01, then the change in consumption reacts significantly to the cointegrating error, and next-period consumption will adjust by 2 percent of the current-period deviation from the cointegrating relationship with coefficient a1 in Equation (9.17). A similar interpretation applies to x in the second equation of System (9.23).

We have to be careful not to mix stationary and nonstationary variables in OLS regressions. For example, the left-hand sides of System (9.22) are written as changes of the dependent variables, which are I(0) variables, so the right-hand sides of the system have to be expressed in I(0) variables as well. Therefore, we cannot write x and y in levels on the left-hand sides because they are I(1) variables, and explanatory variables can be added to the right-hand sides of System (9.22) only if they are I(0) variables. This important point is discussed in detail by Hill, Griffiths, and Lim (2011).

Prof. Metric says that Systems (9.21) and (9.23) play an important role in macroeconomics because consumption and interest rates are both I(1) variables. Using either of these systems will control for these nonstationary variables.
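A minimal sketch of the two-step estimation behind System (9.23) is given below, using simulated cointegrated series and the statsmodels OLS routine; the simulated coefficients, the omission of a constant in the cointegrating regression, and the package itself are illustrative assumptions rather than part of the chapter's Excel steps.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
T = 300
x = np.cumsum(rng.normal(size=T))     # an I(1) series
y = 0.8 * x + rng.normal(size=T)      # cointegrated with x (illustrative slope)

# Step 1: estimate the cointegrating regression (9.17) and keep the residuals.
coint_fit = sm.OLS(y, x).fit()
e_hat = coint_fit.resid

# Step 2: regress the changes on the lagged cointegrating error, one equation
# at a time, as in System (9.23).
dy, dx, e_lag = np.diff(y), np.diff(x), e_hat[:-1]
print(sm.OLS(dy, sm.add_constant(e_lag)).fit().params)   # constant and a11
print(sm.OLS(dx, sm.add_constant(e_lag)).fit().params)   # constant and a21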

Autoregressive Conditional Heteroskedastic (ARCH) Model

Touro asks, “We have discussed models that have time-varying means because they change over time, but I wonder what would happen with a model whose variance changes over time.” Prof. Metric is very pleased with the question and says that a model with a time-varying variance is called an ARCH model. This model is very good for modeling financial market data, which are volatile by nature. The ARCH model is written as:

yt = a0 + a1xt + et (9.24a)

σ²t = b0 + b1e²t−1 (9.24b)

et | It−1 ~ N(0, σ²t) (9.24c)

The subscript t in Equation (9.24b) indicates that σ²t is time varying, that is, the error is heteroskedastic over time, and the variance is written as a function of a constant term and the lagged squared error. The coefficients b0 and b1 must be positive so that the variance is positive, and b1 must be less than one so that the process is stationary.

Equation (9.24c) lets the distribution of the error be conditionally normal, with It-1 representing the information available at time (t−1). This condition requires that the distribution is a function of known information in the previous period. For example:

et | It−1 ~ N(0, b0 + b1e²t−1)

This assumption implies that et has a normal distribution conditional on the known value of e²t−1 from the previous period.
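To see what Equations (9.24a)–(9.24c) imply, the short Python simulation below generates an ARCH(1) error series; the parameter values and sample size are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(5)
T = 1000
b0, b1 = 0.2, 0.6          # illustrative ARCH(1) parameters, both positive

e = np.zeros(T)
h = np.zeros(T)            # conditional variance of the error
h[0] = b0 / (1.0 - b1)     # start at the unconditional variance
for t in range(1, T):
    h[t] = b0 + b1 * e[t - 1] ** 2            # Equation (9.24b)
    e[t] = rng.normal() * np.sqrt(h[t])       # conditionally normal error

# Volatility clustering: large shocks tend to be followed by large shocks.
print(e.std(), np.abs(e).max())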

To test whether a model has an ARCH effect, an LM test similar to the ones introduced in Chapters 4 and 5 is used. For the first order ARCH, we need to perform the following steps:

  (i) Estimate Equation (9.24a), obtain the residuals êt, and generate ê²t.

 (ii) Perform the following regression:

ê²t = c0 + c1ê²t−1 + ut (9.25)

(iii) The null and alternative hypotheses are written as:

H0: c1 = 0; Ha: c1 ≠ 0

(iv) The LM test statistic is:

LM = (T − J)R² (9.26)

where T is the sample size and J is the number of lagged residuals on the right-hand side. For Equation (9.25), J = 1. R2 is the coefficient of determination from estimating Equation (9.25).

If c1 = 0, then there is no ARCH effect. If c1 ≠ 0, then there are ARCH effects, and the magnitude of the squared residuals, ê²t, will depend on ê²t−1. In this case, the model should be estimated using an ARCH technique instead of OLS.
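A minimal Python sketch of steps (i)–(iv) follows; it simulates ARCH(1) residuals as a stand-in for the residuals from Equation (9.24a), and the use of statsmodels is an assumption, not part of the chapter's Excel workflow.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
T = 500
e_hat = np.zeros(T)
for t in range(1, T):                       # simulated ARCH(1) residuals
    e_hat[t] = rng.normal() * np.sqrt(0.2 + 0.6 * e_hat[t - 1] ** 2)

# Step (ii): regress the squared residuals on their first lag (Equation 9.25).
e_sq = e_hat ** 2
aux = sm.OLS(e_sq[1:], sm.add_constant(e_sq[:-1])).fit()

# Steps (iii)-(iv): the LM statistic is the auxiliary R-squared times the number
# of observations actually used, i.e. (T - J)R² with J = 1; compare it with a
# chi-square(1) critical value.
lm_stat = aux.nobs * aux.rsquared
print(lm_stat, aux.pvalues)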

Generalized ARCH (GARCH) Model

The ARCH(1) model can be extended to an ARCH(J) model that allows for more lagged values, and the variance function can be written as:

σ²t = c0 + c1e²t−1 + c2e²t−2 + … + cJe²t−J (9.27)

The problem with an ARCH(J) model is that we will have to estimate (J + 1) parameters, including the constant term. If J is large, we might lose accuracy in the estimation. The generalized ARCH (GARCH) model helps to capture these long lagged effects with only three parameters. To develop this GARCH model, rewrite Equation (9.27) as:

σ²t = c0 + c1e²t−1 + c1b1e²t−2 + c1b1²e²t−3 + … (9.28)

This is a geometric series of lagged values in which the coefficients follow the form cs = c1b1^(s−1). We then add and subtract b1c0 while reorganizing the terms to obtain:

σ²t = (c0 − b1c0) + c1e²t−1 + b1(c0 + c1e²t−2 + c1b1e²t−3 + …) (9.29)

Lagging Equation (9.28) by one period yields:

σ²t−1 = c0 + c1e²t−2 + c1b1e²t−3 + … (9.30)

Combining Equations (9.29) and (9.30) yields:

σ²t = d + c1e²t−1 + b1σ²t−1 (9.31)

where d = (c0 − b1c0). We also need c1 + b1 < 1 for a stationary model.

The model in Equation (9.31) is called a GARCH (1,1) model. If c1 + b1 ≥ 1, we have an integrated GARCH model (IGARCH). Prof. Metric says that the GARCH (1,1) model is very popular in financial estimation because it fits many time series data well.
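For readers who want to try Equation (9.31) on data, the sketch below uses the third-party arch package (an assumption; it is not part of this chapter's Excel toolkit) to fit a GARCH(1,1) to a placeholder return series.

import numpy as np
from arch import arch_model   # third-party "arch" package, assumed installed

rng = np.random.default_rng(7)
returns = rng.normal(scale=1.0, size=1000)   # placeholder return series

# Fit a GARCH(1,1): one lag of the squared error and one lag of the variance.
model = arch_model(returns, mean="Constant", vol="GARCH", p=1, q=1)
result = model.fit(disp="off")
print(result.params)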

Autoregressive Moving Average (ARMA) Models

A model in which the dependent variable is correlated with a lagged value of its error is called a moving average model of order one and is denoted as an MA(1):

yt = b0 + et + b1et−1 (9.32)

A combination of an AR(1) introduced in Chapter 5 and an MA(1) yields an ARMA(1,1) model:

yt = a0 + a1yt−1 + et + b1et−1 (9.33)

An MA(m) denotes an MA model of higher orders with m as the order. If taking the first difference yields a stationary series, the ARMA(1,1) becomes an Autoregressive Integrated Moving Average model of order one ARIMA(1,1,1), where the letter I in the middle of the word ARIMA again stands for “integrated.” A general ARIMA model is expressed as an ARIMA(n, d, m), where d is the order of integration (how many times we need to take the differences to make it stationary).

To choose n and m in the ARIMA (n, d, m) model, a Box–Jenkins procedure that consists of three steps is performed. The first step is the “identification,” in which a model is developed on the basis of autocorrelation functions. The second step is the “estimation,” in which time series analyses are employed to estimate the parameters. The third step is the “diagnostic checking,” in which diagnostic tests and a residual analysis are carried out to check the model adequacy. The procedure can be repeated to seek out the most suitable model. Prof. Metric says that the ARIMA model is a very advanced topic that he cannot discuss in detail in this book and that we should find more detailed information in Hamilton (1994), if we are interested in advanced time-series analysis.
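As a small illustration of the estimation step of the Box–Jenkins procedure, the sketch below fits an ARIMA(1,1,1) with statsmodels to a simulated I(1) series; the orders, data, and package call are assumptions for demonstration only.

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(8)
y = np.cumsum(rng.normal(size=300))    # an I(1) series, so we set d = 1

# Estimation step of the Box-Jenkins procedure: fit an ARIMA(1,1,1),
# i.e. an ARMA(1,1) on the first difference of y.
fit = ARIMA(y, order=(1, 1, 1)).fit()
print(fit.summary())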

Data Analysis

Unit-Root Test

Touro is doing research on tourist expenditures (EXPN) over the years and shows us the data, which are available in the file Ch09.xls, Fig. 9.1. We first estimate Equation (9.6) by regressing ΔEXPNt on EXPNt−1 without a constant term:

Go to Data then Data Analysis, select Regression then click OK.

The input Y range is E1:E99, the input X range is D1:D99.

Check the box Labels and Constant is Zero.

Check the button Output Range and enter G1 then click OK.

Click OK again to overwrite the data.

We then continue with the second regression of ΔEXPNt on EXPNt−1 for Equation (9.7), which has the constant term added to the model:

Go to Data then Data Analysis, select Regression then click OK.

The input Y range is E1:E99, the input X range is D1:D99.

Check the box Labels (this time do not check the box Constant is Zero).

Check the button Output Range and enter G20 then click OK.

Click OK again to overwrite the data.

Finally, we perform the third regression of ΔEXPNt on EXPNt-1 and TIME for Equation (9.8):

Go to Data then Data Analysis, select Regression then click OK.

The input Y range is E1:E99, the input X range is C1:D99.

Check the box Labels.

Check the button Output Range and enter G39 then click OK.

Click OK to overwrite the data.

The t-statistics for the three regressions are in the data file with the values −0.04427 in Cell J18 for Equation (9.6), −3.9402 in Cell J37 for Equation (9.7), and −4.6740 in Cell J58 for Equation (9.8).

Comparing with the τ-critical values in Table 9.1, the null hypothesis of a unit root is not rejected for Equation (9.6), implying nonstationarity, but it is rejected for Equations (9.7) and (9.8), implying stationarity, and the time trend is significant with a p-value of 0.0133. Hence, we should use the results for the third model.
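For readers who prefer to replicate these three regressions outside Excel, a Python sketch follows; the sheet layout and the column name EXPN are assumptions about how the workbook is organized, and the use of pandas and statsmodels is not part of Prof. Empirie's Excel steps.

import pandas as pd
from statsmodels.tsa.stattools import adfuller

# Illustrative: read the tourist-expenditure series from the workbook
# (the sheet index and column name are assumptions about the data layout).
expn = pd.read_excel("Ch09.xls", sheet_name=0)["EXPN"].dropna()

# The three Dickey-Fuller regressions: no constant, constant, constant + trend;
# maxlag=0 keeps the non-augmented form used in the Excel steps above.
for spec in ("n", "c", "ct"):
    tau, pvalue, *_ = adfuller(expn, regression=spec, maxlag=0)
    print(spec, round(tau, 4), round(pvalue, 4))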

Estimating the Model

Figure 9.1 displays the full results for Equation (9.8), which starts from Cell G39 in the data file. From this figure, the estimated results are reported as follows:

ΔEXPNt = 7,184,826 − 9,197 TIME − 0.3609 EXPNt−1

(se)        (1,510,262)   (3,646)    (0.0772)

R2 = 0.1933; Adjusted R2 = 0.1763

Prof. Empirie points out that all estimated coefficients are statistically significant, but the R2 values are low, implying there might be omitted variables. This makes sense because tourist expenditures depend on many factors other than past expenditures and a time trend.


Figure 9.1 Estimation results for Equation (9.8)

Cointegration Test

The cointegration test is performed in a similar manner as the Dickey–Fuller test, except that we have to use the critical values in Table 9.2 instead of those in Table 9.1. Booka offers a dataset of consumer demand (DEM) for books from her company, where demand depends on income (INCM). The dataset is available in the file Ch09.xls, Fig. 9.2–9.3. Booka tells us that both series were found to follow I(1) processes when she performed Dickey–Fuller tests at her company. For practical purposes, she only tested the model without a constant, like the one in Equation (9.6). Hence, we need to perform a cointegration test to see whether the two series are cointegrated, by estimating Equation (9.13). We first perform a regression of Equation (9.2) without a constant:

DEMt = a1INCMt + et (9.34)

Here are the steps for this regression:

Go to Data then Data Analysis, select Regression then click OK.

The input Y range is B1:B34, the input X range is C1:C34.

Check the box Labels, Constant is Zero, and Residuals.

Check the button Output Range and enter H1 then click OK.

Click OK again to overwrite the data.


Figure 9.2 Regressing DEMt on INCMt: Estimation results

The results are displayed in Figure 9.2.

From this figure, we see that income affects demand for books positively and significantly, and the R2 value is high. However, if the two series are not cointegrated, these results are not reliable. Therefore, we have to obtain the residuals and generate their lagged values for the cointegration test by doing the following steps:

Copy the residuals in Cells J25 through J57 and paste them into Cells D2 through D34.

Also paste these residuals into Cells E3 through E35 to generate their lagged values.

Type =D3-E3 into Cell F3, then copy and paste the formula into Cells F4 through F34.

We now perform the regression of Δet on et-1 as follows:

Go to Data then Data Analysis, select Regression then click OK.

The input Y range is F3:F34, the input X range is E3:E34.

Uncheck the boxes Labels and Residuals, but check the box Constant is Zero.

Check the button Output Range and enter R1 then click OK.

Click OK again to overwrite the data.


Figure 9.3 Cointegration test: Results

The results are displayed in Figure 9.3.

Using the information in Table 9.2, we see that the null hypothesis of no cointegration is rejected, implying the two series are cointegrated. Hence, the VEC technique can be used to estimate a model similar to System (9.21) or (9.23).
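The same residual-based test can be replicated in Python, as sketched below; the sheet and column names (DEM, INCM) are assumptions about Booka's workbook, and the regressions deliberately omit the constant to match her specification.

import pandas as pd
import statsmodels.api as sm

# Illustrative: the sheet and column names are assumptions about the layout
# of Booka's workbook.
data = pd.read_excel("Ch09.xls", sheet_name=0)
dem, incm = data["DEM"], data["INCM"]

# Step 1: regress DEM on INCM without a constant and keep the residuals.
e_hat = sm.OLS(dem, incm).fit().resid

# Step 2: regress the change in the residuals on their lag, again without a
# constant, and compare the t-statistic with the values in Table 9.2.
de = e_hat.diff().dropna()
test = sm.OLS(de, e_hat.shift(1).dropna()).fit()
print(test.tvalues)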

For our convenience, Booka has already rearranged the data in the file Ch09.xls, Fig. 9.4–9.5. Estimating System (9.21) amounts to a standard OLS regression for each equation, so we focus on the error-correction form. The cointegrating value of 0.0452 in Cell I18 of Figure 9.2 has been copied and pasted into Cell I2 of the file Ch09.xls, Fig. 9.4–9.5. Prof. Empirie tells us to obtain êt−1 and estimate System (9.23), but to eliminate the constant to be consistent with Booka’s model.

ΔDEMt = a11êt−1 + v1t
ΔINCMt = a21êt−1 + v2t (9.35)

Here are the steps to obtain êt−1 and estimate the demand equation in System (9.35):

Type = F2 + $I$2*G2 into Cell H2.

Copy and paste the formula in Cell H2 into Cells H3 through H34.

Go to Data then Data Analysis, select Regression then click OK.

The input Y range is B1:B34, the input X range is H1:H34.

Check the box Labels and Constant is Zero.

Check the button Output Range and enter J1 then click OK.

Click OK again to overwrite the data.

The results of the demand equation are displayed in Figure 9.4.


Figure 9.4 Results for system (9.35): Demand equation

From this figure, we see that the change in demand does not react to the cointegrating error: the error term has a p-value of 0.5571. Hence, there is no significant deviation of DEM from the cointegrating relationship, with coefficient 0.0452, estimated in Figure 9.2. We then estimate the income equation in System (9.35):

Go to Data then Data Analysis, select Regression then click OK.

The input Y range is C1:C34, the input X range is H1:H34.

Check the box Labels and Constant is Zero.

Check the button Output Range and enter T1 then click OK.

Click OK again to overwrite the data.

The results for the income equation are displayed in Figure 9.5.

From this figure, we see that the change in income does not react to the cointegrating error either. The conclusion for this case is that the cointegration of DEM and INCM produces consistent OLS estimators which reveal a stable relationship of the two variables over time.

Prof. Empirie says that if the two series are not cointegrated, then we can take the first differences on both sides of the equation. Once the differences are generated, we can apply the first difference technique introduced in Chapters 5 and 6 to estimate a VAR system similar to System (9.16). She also tells us that estimating the other advanced models requires special software or add-in tools to Excel and so is beyond the scope of this textbook.


Figure 9.5 Results for system (9.35): Income equation

Exercises

1. The file Maui.xls contains data on petroleum imports in Maui, Hawaii. Perform the Dickey–Fuller tests on Equations (9.6), (9.7), and (9.8).

2. It is well known that consumption of nondurable goods and the interest rate follow I(1) processes. Data on consumption of nondurable goods (CONS) and the real interest rate (RINT) are in the file CONS.xls. Use these data to perform a cointegration test on the following model:

CONSt = a0 + a1RINTt + et

3. Use the same data in file CONS.xls to estimate the following VEC model:

image
