3
Multivariate time series regression models

Regression analysis is one of the most commonly used statistical methods and is covered in most undergraduate and graduate statistics courses. However, the method discussed in these courses is the standard multiple regression model with a single response variable. In this chapter, we will introduce multivariate time series regression models with several response variables, and we will illustrate these methods with many examples.

3.1 Introduction

In this chapter, we will discuss several different formulations of multivariate time series regression models. Multiple regression is one of the most commonly used statistical models, so we will start with its multivariate representation in the next section. Other extensions and representations will be introduced in Sections 3.3 and 3.4. They include the representation adapted from the vector autoregressive (VAR) models, which will be referred to as vector time series regression models, and the VARX model. We will discuss the similarities and differences among these extensions and representations.

3.2 Multivariate multiple time series regression models

3.2.1 The classical multiple regression model

In a multiple regression model, a response variable Y is related to k predictor variables, X1, X2, …, Xk, as follows,

(3.1) Y = β0 + β1X1 + β2X2 + ⋯ + βkXk + ξ,

where ξ is assumed to be uncorrelated white noise, often i.i.d. N(0, σ²). When time series data are used to fit a multiple regression model, we often write Eq. (3.1) as

(3.2) Yt = β0 + β1X1,t + β2X2,t + ⋯ + βkXk,t + ξt,

where t refers to time, Yt and Xi,t are the values of the response and the ith predictor at time t, and in time series regression ξt is normally assumed to follow a time series model such as an AR(p) model.

When we have time series data from time t = 1 to t = n, we can present Eq. (3.2) in the matrix form,

(3.3) Y = Xβ + ξ,

where

Y = [Y1, Y2, …, Yn]′, β = [β0, β1, …, βk]′, ξ = [ξ1, ξ2, …, ξn]′,

X is the n × (k + 1) design matrix whose tth row is [1, X1,t, …, Xk,t], and ξ follows an n‐dimensional multivariate normal distribution N(0, Σ). Given Σ, the generalized least squares (GLS) estimator

(3.4) β̂ = (X′Σ⁻¹X)⁻¹X′Σ⁻¹Y

is known to be the best unbiased estimator in the sense that for any constant vector c, the estimator c′β̂ has the smallest possible variance among all unbiased estimators c′b of c′β, where b is any unbiased estimator of β of the linear form b = AY for a constant matrix A.
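As a minimal numerical sketch of the GLS estimator in Eq. (3.4), the snippet below simulates a regression whose errors have an assumed AR(1) covariance structure and then applies the GLS formula directly. All parameter values (n, k, φ, the true β) are illustrative, not from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 2
# Design matrix X with an intercept column, as in Eq. (3.3)
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta = np.array([1.0, 2.0, -0.5])          # illustrative true coefficients

# Assumed error covariance: AR(1) with phi = 0.5, so
# Sigma[i, j] = sigma^2 * phi^{|i-j|} / (1 - phi^2)
phi, sigma2 = 0.5, 1.0
lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
Sigma = sigma2 * phi ** lags / (1.0 - phi ** 2)

# Simulate autocorrelated errors and the response Y = X beta + xi
xi = rng.multivariate_normal(np.zeros(n), Sigma)
Y = X @ beta + xi

# GLS estimator of Eq. (3.4): (X' Sigma^{-1} X)^{-1} X' Sigma^{-1} Y
Sinv = np.linalg.inv(Sigma)
beta_gls = np.linalg.solve(X.T @ Sinv @ X, X.T @ Sinv @ Y)
```

With Σ known, `beta_gls` recovers the true coefficients up to sampling error; in practice Σ must be estimated, which is the subject of Section 3.3.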

3.2.2 Multivariate multiple regression model

Now, suppose that instead of one response variable in Eq. (3.2), we have m response time series variables related to these k predictor time series variables, that is,

Yi,t = βi,0 + βi,1X1,t + ⋯ + βi,kXk,t + ξi,t, i = 1, 2, …, m,

or

(3.5) Yt = βxt + ξt,

where

Yt = [Y1,t, Y2,t, …, Ym,t]′,

xt = [1, X1,t, X2,t, …, Xk,t]′,

β is the m × (k + 1) parameter matrix whose ith row is [βi,0, βi,1, …, βi,k],

and

ξt = [ξ1,t, ξ2,t, …, ξm,t]′.

For i = 1, 2, …, m and time t = 1 to t = n, let

Y(i) = [Yi,1, Yi,2, …, Yi,n]′, β(i) = [βi,0, βi,1, …, βi,k]′, ξ(i) = [ξi,1, ξi,2, …, ξi,n]′.

The matrix form of the multiple regression for the ith response variable Y(i) is

(3.6) Y(i) = Xβ(i) + ξ(i),

which, as expected, is exactly the same as Eq. (3.3). Putting all the multiple regressions for the m response variables together from t = 1 to t = n, we have

(3.7) Y = Xβ + ξ,

where

Y = [Y(1), Y(2), …, Y(m)] is the n × m response matrix,

β = [β(1), β(2), …, β(m)] is the (k + 1) × m parameter matrix,

and

ξ = [ξ(1), ξ(2), …, ξ(m)] is the n × m error matrix.

Each ξ(i) follows an n‐dimensional multivariate normal distribution N(0, Σ(i)), i = 1, …, m, and ξ(i) and ξ(j) are uncorrelated for i ≠ j. We will call the model given in Eq. (3.7) the multivariate multiple time series regression model.
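Because every response in Eq. (3.7) shares the same design matrix X, fitting the m multiple regressions equation by equation reduces to a single matrix computation. The following numpy sketch illustrates this with arbitrary illustrative dimensions and parameter values, taking the errors to be white noise for simplicity:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k, m = 120, 3, 2
# Common design matrix X, n x (k + 1), with an intercept column
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta = rng.normal(size=(k + 1, m))        # beta = [beta(1), ..., beta(m)]
xi = rng.normal(scale=0.1, size=(n, m))   # errors, taken as white noise here
Y = X @ beta + xi                         # Y = [Y(1), ..., Y(m)], n x m

# Equation-by-equation OLS is equivalent to one multivariate solve:
# beta_hat = (X'X)^{-1} X'Y, a (k + 1) x m matrix
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
```

Column i of `beta_hat` is exactly the OLS estimate one would obtain by regressing Y(i) on X alone, which is why the stacked form of Eq. (3.7) loses nothing relative to m separate regressions.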

3.3 Estimation of the multivariate multiple time series regression model

3.3.1 The Generalized Least Squares (GLS) estimation

As noted in Eq. (3.6), the ith response Y(i) actually follows the general multiple time series regression model

(3.8) Yi,t = βi,0 + βi,1X1,t + ⋯ + βi,kXk,t + ξi,t,

or

(3.9) Y(i) = Xβ(i) + ξ(i),

where ξ(i) = [ξi,1, ξi,2, …, ξi,n]′ follows an n‐dimensional multivariate normal distribution N(0, Σ(i)). In the time series regression, ξi,t is often assumed to follow a time series model such as an AR(p) model. From the results of the multiple regression, we know that when Σ(i) is known, the GLS estimator

(3.10) β̂(i) = (X′Σ(i)⁻¹X)⁻¹X′Σ(i)⁻¹Y(i)

is the best unbiased estimator.

Normally, we will not know the variance–covariance matrix Σ(i) of ξ(i). Even if ξi,t follows a time series model such as AR(p) or ARMA(p, q), the Σ(i) structure is not known because the related time series model parameters are usually unknown. In this case, we use the following GLS procedure suggested in Wei (2006, Chapter 15):

  • Step 1: Calculate the ordinary least squares (OLS) residuals ξ̂i,t from the OLS fitting of the model in (3.8).
  • Step 2: Estimate the parameters of the assumed time series model for ξi,t based on the OLS residuals ξ̂i,t.
  • Step 3: Compute Σ̂(i) from the estimated model for ξi,t obtained in step 2.
  • Step 4: Compute the GLS estimator, β̂(i), using the Σ̂(i) obtained in step 3.

Compute the residuals ξ̂i,t from the GLS model fitting in step 4, and repeat step 1 through step 4 until a convergence criterion (such as the maximum absolute value change in the estimates between iterations becoming less than a specified quantity) is reached.
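The four-step iterative GLS procedure above can be sketched in numpy for a single response, here under the (illustrative) assumption that ξi,t follows an AR(1) model; the helper name `ar1_sigma` and all parameter values are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 200, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta = np.array([0.5, 1.5, -1.0])        # illustrative true coefficients

# Simulate AR(1) errors xi_t = phi * xi_{t-1} + a_t with phi = 0.6
phi_true = 0.6
xi = np.zeros(n)
for t in range(1, n):
    xi[t] = phi_true * xi[t - 1] + rng.normal()
Y = X @ beta + xi

def ar1_sigma(phi, n):
    """Covariance matrix of an AR(1) process with unit innovation variance."""
    lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return phi ** lags / (1.0 - phi ** 2)

# Step 1: OLS fit and residuals
b = np.linalg.solve(X.T @ X, X.T @ Y)
for _ in range(10):
    resid = Y - X @ b
    # Step 2: estimate the AR(1) parameter from the residuals
    phi = resid[1:] @ resid[:-1] / (resid @ resid)
    # Step 3: build Sigma_hat from the fitted error model
    Sinv = np.linalg.inv(ar1_sigma(phi, n))
    # Step 4: GLS estimate using Sigma_hat, then iterate to convergence
    b_new = np.linalg.solve(X.T @ Sinv @ X, X.T @ Sinv @ Y)
    if np.max(np.abs(b_new - b)) < 1e-8:
        b = b_new
        break
    b = b_new
```

In practice the error model (AR(p), ARMA(p, q), …) would be identified from the residuals rather than assumed, but the loop structure of the procedure is the same.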

Combining β̂(i) for i = 1, …, m, we get

(3.11) β̂ = [β̂(1), β̂(2), …, β̂(m)],

where each β̂(i) is the GLS estimator of β(i) from Eq. (3.10), and the estimate of the variance–covariance matrix Σ(i) of ξ(i) is given by step 3 in the last GLS iteration.

It should be pointed out that although the error term can be autocorrelated in the time series regression model, it should be stationary. A nonstationary error structure could produce a spurious regression where a significant regression can be achieved for totally unrelated series as pointed out by Abraham and Ledolter (2006), Chatterjee, Hadi, and Price (2006), Draper and Smith (1998), Granger and Newbold (1986), and Phillips (1986).
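The spurious-regression danger noted above is easy to reproduce by simulation. The sketch below (all settings illustrative) regresses one random walk on another, independent random walk and compares the resulting R² values with those from regressing two independent white-noise series:

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 200, 200

def r_squared(y, x):
    """R^2 from the OLS regression of y on an intercept and x."""
    X = np.column_stack([np.ones(len(x)), x])
    b = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ b
    return 1.0 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

r2_walk, r2_noise = [], []
for _ in range(reps):
    e1, e2 = rng.normal(size=(2, n))
    # Unrelated nonstationary series: independent random walks
    r2_walk.append(r_squared(np.cumsum(e1), np.cumsum(e2)))
    # Unrelated stationary series: independent white noise
    r2_noise.append(r_squared(e1, e2))
```

Averaged over the replications, the random-walk regressions produce much larger R² values than the white-noise regressions even though the series are unrelated in both cases, which is precisely the spurious-regression phenomenon of Granger and Newbold (1986) and Phillips (1986).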

3.3.2 Empirical Example I – U.S. retail sales and some national indicators

3.4 Vector time series regression models

3.4.1 Extension of a VAR model to VARX models

Recall from Chapter 2 that the m‐dimensional vector autoregressive model, VAR(p), is given by

(3.16) Yt = θ0 + Φ1Yt−1 + ⋯ + ΦpYt−p + at,

where θ0 is an m × 1 constant vector, the Φi are m × m parameter coefficient matrices, and at is an m‐dimensional vector white noise process, VWN(0, Σ). Eq. (3.16) can be extended to the following

(3.17) Yt = θ0 + Φ1X1,t + Φ2X2,t + ⋯ + ΦkXk,t + ξt,

where a response vector Yt is related to k predictor vectors, X1,t, …, Xk,t, and the error vector, ξt, is an m‐dimensional Gaussian vector white noise process, VWN(0, Σ). To make the model in Eq. (3.17) more general, some or all of the predictor vectors need not have the same dimension as the response vector Yt. For example, instead of the dimension m, Xi,t can have a dimension r. In such a case, the dimension of the associated parameter coefficient matrix Φi will be m × r, which is no longer a square matrix like those in the VAR(p) model.

For the multivariate time series regression, some software packages, such as MATLAB (2017), use the following model,

(3.18) Yt = θ0 + Φ1Yt−1 + ⋯ + ΦpYt−p + Xtβ + at,

where Xt is an m × r design matrix for r exogenous variables. Since the model involves the VAR structure for Yt and the predictor matrix Xt, it is known as a VARX model. However, it should be noted that in Eq. (3.18) the regression coefficient vector β associated with the r exogenous variables is an r × 1 vector, which implies that the entries of Xt share a common set of regression coefficients for all t, and this is relatively restrictive. In this formulation, the VARX model in Eq. (3.18) without the lagged response vector variables Yt − j does not reduce to the multivariate multiple regression model given in Eq. (3.5).

Another representation of the VARX model is given by

(3.19) Yt = θ0 + Φ1Yt−1 + ⋯ + ΦpYt−p + Θ0Xt + Θ1Xt−1 + ⋯ + ΘsXt−s + at,

where Φi is an m × m parameter matrix for Yt − i, Xt is an r‐dimensional time series vector for the r exogenous variables, and Θi is an m × r parameter matrix for Xt − i. This representation, used by other software such as SAS, is called the VARX(p, s) model, and it is the form that we recommend. The parameters of vector time series regression models are estimated through either least squares (LS) or maximum likelihood (ML), similar to the vector time series models introduced in Chapter 2. Once the model is fitted, it can be used to forecast Yt + ℓ as follows

(3.20) Ŷt(ℓ) = θ̂0 + Φ̂1Ŷt(ℓ − 1) + ⋯ + Φ̂pŶt(ℓ − p) + Θ̂0X̂t(ℓ) + Θ̂1X̂t(ℓ − 1) + ⋯ + Θ̂sX̂t(ℓ − s),

where Ŷt(j) = Yt + j for j ≤ 0, and a separate vector time series model of Xt may need to be constructed for the forecasts X̂t(j), j ≥ 0.

The forecasting procedures are exactly the same as those discussed in Chapter 2. Rather than repeating them, we will look at some useful empirical examples instead.
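Before turning to the examples, the least squares fit of the recommended VARX(p, s) form in Eq. (3.19) can be sketched for the simplest case p = 1, s = 0, i.e. Yt = θ0 + Φ1Yt−1 + Θ0Xt + at. All dimensions and parameter values below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
n, m, r = 500, 2, 1
# Illustrative true parameters for a stable VARX(1, 0) model
theta0 = np.array([0.2, -0.1])
Phi1 = np.array([[0.5, 0.1], [0.0, 0.4]])
Theta0 = np.array([[1.0], [0.5]])

# Simulate the exogenous series X_t and the response Y_t
Xex = rng.normal(size=(n, r))
Y = np.zeros((n, m))
for t in range(1, n):
    Y[t] = theta0 + Phi1 @ Y[t - 1] + Theta0 @ Xex[t] + rng.normal(scale=0.2, size=m)

# Stack the regressors [1, Y_{t-1}, X_t] and solve the multivariate LS problem
Z = np.column_stack([np.ones(n - 1), Y[:-1], Xex[1:]])   # (n-1) x (1 + m + r)
coef = np.linalg.solve(Z.T @ Z, Z.T @ Y[1:])             # (1 + m + r) x m
theta0_hat = coef[0]
Phi1_hat = coef[1:1 + m].T
Theta0_hat = coef[1 + m:].T
```

With more lags (p > 1 or s > 0), the regressor matrix Z simply gains additional blocks Yt−i and Xt−i; forecasting then proceeds by iterating Eq. (3.20), with a separate model for Xt supplying its future values.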

3.4.2 Empirical Example II – VARX models for U.S. retail sales and some national indicators

3.5 Empirical Example III – Total mortality and air pollution in California

Projects

  1. Find m response and k predictor time series variables with m ≥ 3 and k ≥ 5 of your interest. Build a multivariate multiple time series regression model with a written report and associated software code.
  2. Build a vector time series regression model on the response and predictor variables from Project 1 with a written report and associated software code.
  3. Build a vector time series VARX(p,s) model on the response and predictor variables from Project 1 with a written report and associated software code.
  4. Use (n – 3) observations to estimate the models obtained from Projects 2 and 3. Forecast the next three periods of the response variables and compare the forecast results from the two models.
  5. Find a social science or natural science time series data set that includes multivariate responses and predictors, construct a multivariate multiple time series regression analysis, and complete it with a written report and the analysis software code.

References

  1. Abraham, B. and Ledolter, J. (2006). Introduction to Regression Modeling. Thomson Brooks/Cole.
  2. Box, G.E.P. and Tiao, G.C. (1975). Intervention analysis with applications to economic and environmental problems. Journal of the American Statistical Association 70: 70–79.
  3. Chatterjee, S., Hadi, A.S., and Price, B. (2006). Regression Analysis by Example, 4e. Wiley.
  4. Draper, N.R. and Smith, H. (1998). Applied Regression Analysis, 3e. Wiley.
  5. Granger, C.W.J. and Newbold, P. (1986). Forecasting Economic Time Series, 2e. Academic Press.
  6. Islam, M.Q. (2017). Estimation and hypothesis testing in multivariate linear regression models under non‐normality. Communications in Statistics – Theory and Methods 46: 8521–8543.
  7. Lütkepohl, H. (2007). New Introduction to Multiple Time Series Analysis. Springer.
  8. MATLAB (2017). https://www.mathworks.com/help/matlab
  9. Pankratz, A. (1991). Forecasting with Dynamic Regression Models. Wiley.
  10. Phillips, P.C.B. (1986). Understanding spurious regressions in econometrics. Journal of Econometrics 33: 311–340.
  11. Rao, C.R. (2002). Linear Statistical Inference and its Applications, 2e. Wiley.
  12. SAS Institute, Inc. (2015). SAS/ETS User's Guide. Cary, NC: SAS Institute, Inc.
  13. Wei, W.W.S. (2006). Time Series Analysis – Univariate and Multivariate Methods, 2e. Pearson Addison‐Wesley.