Multivariate GARCH Processes
While the volatility of univariate series has been the focus of the previous chapters, modeling the comovements of several series is of great practical importance. When several series displaying temporal or contemporaneous dependencies are available, it is useful to analyze them jointly, by viewing them as the components of a vector-valued (multivariate) process. The standard linear modeling of real time series has a natural multivariate extension through the framework of the vector ARMA (VARMA) models. In particular, the subclass of vector autoregressive (VAR) models has been widely studied in the econometric literature. This extension entails numerous specific problems and has given rise to new research areas (such as cointegration).
Similarly, it is important to introduce the concept of a multivariate GARCH model. For instance, asset pricing and risk management crucially depend on the conditional covariance structure of the assets of a portfolio. Unlike the ARMA models, however, the GARCH model specification does not suggest a natural extension to the multivariate framework. Indeed, the (conditional) expectation of a vector of size m is a vector of size m, but the (conditional) variance is an m × m matrix. A general extension of the univariate GARCH processes would involve specifying each of the m(m + 1)/2 entries of this matrix as a function of its past values and the past values of the other entries. Given the excessive number of parameters that this approach would entail, it is not feasible from a statistical point of view. An alternative approach is to introduce some specification constraints which, while preserving a certain generality, make these models operational.
We start by reviewing the main concepts for the analysis of the multivariate time series.
11.1 Multivariate Stationary Processes
In this section, we consider a vector process (Xt)t∈ℤ of dimension m, with Xt = (X1t, …, Xmt)′. The definition of strict stationarity (see Chapter 1, Definition 1.1) remains valid for vector processes, while second-order stationarity is defined as follows.
Definition 11.1 (Second-order stationarity) The process (Xt) is said to be second-order stationary if:
(i) E‖Xt‖² < ∞, for all t ∈ ℤ;
(ii) EXit = μi, for all t ∈ ℤ and i = 1, …, m;
(iii) Cov(Xt, Xt+h) = E[(Xt − μ)(Xt+h − μ)′] = ΓX(h), for all t, h ∈ ℤ, where μ = (μ1, …, μm)′.
The function ΓX(·), taking values in the space of m × m matrices, is called the autocovariance function of (Xt). Obviously ΓX(h) = ΓX(−h)′. In particular, ΓX(0) = Var(Xt) is a symmetric matrix.
The simplest example of a multivariate stationary process is white noise, defined as a sequence of centered and uncorrelated variables whose covariance matrix is time-independent.
The following property can be used to construct stationary processes by linear transformation of another stationary process.
Theorem 11.1 (Stationary linear filter) Let (Zt) denote a stationary process with values in ℝm. Let (Ck)k∈ℤ, Ck = (ck(i, j)), denote a sequence of nonrandom n × m matrices such that, for all i = 1, …, n and all j = 1, …, m, Σ_{k∈ℤ} |ck(i, j)| < ∞. Then the ℝn-valued process defined by
Xt = Σ_{k∈ℤ} Ck Zt−k
is stationary and we have, in obvious notation,
μX = (Σ_{k∈ℤ} Ck) μZ,  ΓX(h) = Σ_{k∈ℤ} Σ_{ℓ∈ℤ} Ck ΓZ(h + k − ℓ) C′ℓ.
The proof of an analogous result is given by Brockwell and Davis (1991, pp. 83–84) and the arguments used extend straightforwardly to the multivariate setting. When, in this theorem, (Zt) is a white noise and Ck = 0 for all k < 0, (Xt) is called a vector moving average process of infinite order, VMA(∞). A multivariate extension of Wold's representation theorem (see Hannan, 1970, pp. 157–158) states that if (Xt) is a stationary and purely nondeterministic process, it can be represented as an infinite-order moving average,
Xt = C(B) εt = Σ_{k=0}^∞ Ck εt−k,
where (εt) is an (m × 1) white noise, B is the lag operator, C(B) = Σ_{k=0}^∞ Ck B^k with C0 = Im, and the matrices Ck are not necessarily absolutely summable but satisfy the (weaker) condition
Σ_{k=0}^∞ ‖Ck‖² < ∞
for any matrix norm ‖·‖. The following definition generalizes the notion of a scalar ARMA process to the multivariate case.
Definition 11.2 (VARMA(p, q) process) An ℝm-valued process (Xt)t∈ℤ is called a vector ARMA process of orders p and q (VARMA(p, q)) if (Xt)t∈ℤ is a stationary solution to the difference equation
Φ(B) Xt = c + Ψ(B) εt,
where (εt) is an (m × 1) white noise with covariance matrix Ω, c is an m × 1 vector, and
Φ(z) = Im − Φ1z − ⋯ − Φpz^p  and  Ψ(z) = Im − Ψ1z − ⋯ − Ψqz^q
are matrix-valued polynomials.
Denote by det(A), or more simply |A| when there is no ambiguity, the determinant of a square matrix A. A sufficient condition for the existence of a stationary and invertible solution to the preceding equation is
∀z ∈ ℂ, |z| ≤ 1 ⇒ |Φ(z)| |Ψ(z)| ≠ 0
(see Brockwell and Davis, 1991, Theorems 11.3.1 and 11.3.2).
When p = 0, the process is called vector moving average of order q (VMA(q)); when q = 0, the process is called vector autoregressive of order p (VAR(p)).
Note that the determinant |Φ(z)| is a polynomial admitting a finite number of roots z1, …, zmp; let δ = mini |zi| (δ > 1 under the preceding condition). The power series expansion
Φ(z)^{−1} = Σ_{k=0}^∞ Ck z^k = |Φ(z)|^{−1} Φ*(z),
where A* denotes the adjoint of the matrix A (that is, the transpose of the matrix of the cofactors of A), is well defined for |z| < δ, and is such that Φ(z)^{−1}Φ(z) = Im. The matrices Ck are recursively obtained by
C0 = Im,  Ck = Σ_{i=1}^{min(k,p)} Ck−i Φi,  k ≥ 1.
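As an illustration, this recursion is easy to implement. The following minimal sketch (NumPy; the matrix Phi1 is illustrative) computes the first Ck and checks them against the known VAR(1) solution Ck = Φ1^k:

```python
import numpy as np

def ma_coefficients(phis, n_terms):
    """Ck of Phi(z)^{-1} = sum_k Ck z^k, where Phi(z) = I - Phi_1 z - ... - Phi_p z^p,
    via C0 = I and Ck = sum_{i=1}^{min(k,p)} C_{k-i} Phi_i."""
    m, p = phis[0].shape[0], len(phis)
    C = [np.eye(m)]
    for k in range(1, n_terms):
        C.append(sum(C[k - i] @ phis[i - 1] for i in range(1, min(k, p) + 1)))
    return C

Phi1 = np.array([[0.5, 0.1], [0.0, 0.3]])     # illustrative VAR(1) coefficient
C = ma_coefficients([Phi1], 5)
assert np.allclose(C[3], np.linalg.matrix_power(Phi1, 3))  # Ck = Phi1^k here
```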
11.2 Multivariate GARCH Models
As in the univariate case, we can define multivariate GARCH models by specifying their first two conditional moments. An ℝm-valued GARCH process (εt), with εt = (ε1t, …, εmt)′, must then satisfy, for all t ∈ ℤ,
E(εt | εu, u < t) = 0,  Var(εt | εu, u < t) = E(εtε′t | εu, u < t) = Ht.   (11.5)
The multivariate extension of the notion of the strong GARCH process is based on an equation of the form
εt = Ht^{1/2} ηt,   (11.6)
where (ηt) is a sequence of iid ℝm-valued variables with zero mean and identity covariance matrix. The matrix Ht^{1/2} can be chosen to be symmetric and positive definite, but it can also be chosen to be triangular, with positive diagonal elements (see, for instance, Harville, 1997, Theorem 14.5.11). The latter choice may be of interest because if, for instance, Ht^{1/2} is chosen to be lower triangular, the first component of εt only depends on the first component of ηt. When m = 2, we can thus set
ε1t = h11,t^{1/2} η1t,
ε2t = (h12,t/h11,t^{1/2}) η1t + (h22,t − h12,t²/h11,t)^{1/2} η2t,   (11.7)
where ηit and hij,t denote the generic elements of ηt and Ht.
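In practice a lower triangular square root is obtained by a Cholesky decomposition. A minimal sketch (the matrix H below is illustrative) drawing εt = Ht^{1/2} ηt in this way:

```python
import numpy as np

H = np.array([[1.0, 0.3], [0.3, 2.0]])        # an illustrative H_t
L = np.linalg.cholesky(H)                     # lower triangular, L @ L.T == H
eta = np.random.default_rng(0).standard_normal(2)
eps = L @ eta
# The first row of L is (sqrt(h11), 0), so eps_1 depends on eta_1 only,
# exactly as in (11.7).
assert L[0, 1] == 0.0
```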
Note that any square integrable solution (εt) of (11.6) is a martingale difference satisfying (11.5).
Choosing a specification for Ht is obviously more delicate than in the univariate framework because: (i) Ht should be (almost surely) symmetric, and positive definite for all t; (ii) the specification should be simple enough to be amenable to probabilistic study (existence of solutions, stationarity, …), while being of sufficient generality; (iii) the specification should be parsimonious enough to enable feasible estimation. However, the model should not be so simple that it cannot capture the possibly sophisticated dynamics of the covariance structure.
Moreover, it may be useful to have the so-called stability by aggregation property. If εt satisfies (11.5), the process (ε̃t) defined by ε̃t = Pεt, where P is an invertible square matrix, is such that
E(ε̃t | ε̃u, u < t) = 0,  Var(ε̃t | ε̃u, u < t) = PHtP′ := H̃t.
The stability by aggregation of a class of specifications for Ht requires that the conditional variance matrices H̃t belong to the same class for any choice of P. This property is particularly relevant in finance because if the components of the vector εt are asset returns, ε̃t is a vector of returns of portfolios of the same assets, each of its components consisting of amounts (coefficients of the corresponding row of P) of the initial assets.
11.2.1 Diagonal Model
A popular specification, known as the diagonal representation, is obtained by assuming that each element hkℓ,t of the covariance matrix Ht is formulated in terms only of the product of the prior returns k and ℓ. Specifically,
hkℓ,t = ωkℓ + Σ_{i=1}^q a(i)kℓ εk,t−i εℓ,t−i + Σ_{j=1}^p b(j)kℓ hkℓ,t−j,
with ωkℓ = ωℓk, a(i)kℓ = a(i)ℓk, b(j)kℓ = b(j)ℓk for all (k, ℓ). For m = 1 this model coincides with the usual univariate formulation. When m > 1 the model obviously has a large number of parameters and will not in general produce positive definite covariance matrices Ht. We have
Ht = Ω + Σ_{i=1}^q A(i) ⊙ (εt−i ε′t−i) + Σ_{j=1}^p B(j) ⊙ Ht−j,
where ⊙ denotes the Hadamard product, that is, the element by element product, Ω = (ωkℓ), A(i) = (a(i)kℓ) and B(j) = (b(j)kℓ). Thus, in the ARCH case (p = 0), sufficient positivity conditions are that Ω is positive definite and the A(i) are positive semi-definite, but these constraints do not easily generalize to the GARCH case. We shall give further positivity conditions obtained by expressing the model in a different way, viewing it as a particular case of a more general class.
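The positivity argument in the ARCH case rests on the Schur product theorem (the Hadamard product of positive semi-definite matrices is positive semi-definite). A small simulation sketch, with illustrative matrices Ω (positive definite) and A(1) (positive semi-definite):

```python
import numpy as np

rng = np.random.default_rng(1)

def is_psd(M, tol=1e-10):
    return np.all(np.linalg.eigvalsh(M) >= -tol)

# Diagonal ARCH(1) in Hadamard form: H_t = Omega + A1 ∘ (eps eps').
Omega = np.array([[1.0, 0.2], [0.2, 1.0]])    # positive definite (illustrative)
A1 = np.array([[0.3, 0.1], [0.1, 0.2]])       # positive semi-definite (illustrative)
for _ in range(1000):
    eps = rng.standard_normal(2)
    Ht = Omega + A1 * np.outer(eps, eps)      # '*' is the Hadamard product
    assert is_psd(Ht)                          # PSD by the Schur product theorem
```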
It is easy to see that the model is not stable by aggregation: for instance, the conditional variance of ε1,t + ε2,t can in general be expressed as a function of the ε²1,t−i, ε²2,t−i and ε1,t−i ε2,t−i, but not of the (ε1,t−i + ε2,t−i)². A final drawback of this model is that there is no interaction between the different components of the conditional covariance matrix, which appears unrealistic for applications to financial series.
In what follows we present the main specifications introduced in the literature, before turning to the existence of solutions. Let η denote a probability distribution on ℝm with zero mean and unit covariance matrix.
11.2.2 Vector GARCH Model
The vector GARCH (VEC-GARCH) model is the most direct generalization of univariate GARCH: every conditional covariance is a function of lagged conditional variances as well as lagged cross-products of all components. In some sense, everything is explained by everything, which makes this model very general but also not very parsimonious.
Denote by vech(·) the operator that stacks the columns of the lower triangular part of its argument square matrix (if A = (aij), then vech(A) = (a11, a21, …, am1, a22, …, am2, …, amm)′). The next definition is a natural extension of the standard GARCH(p, q) specification.
Definition 11.3 (VEC-GARCH(p, q) process) Let (ηt) be a sequence of iid variables with distribution η. The process (εt) is said to admit a VEC-GARCH(p, q) representation (relative to the sequence (ηt)) if it satisfies
εt = Ht^{1/2} ηt,
ht := vech(Ht) = ω + Σ_{i=1}^q A(i) vech(εt−i ε′t−i) + Σ_{j=1}^p B(j) ht−j,   (11.9)
where ω is a vector of size m(m + 1)/2 × 1, and the A(i) and B(j) are matrices of dimension m(m + 1)/2 × m(m + 1)/2.
Remark 11.1 (The diagonal model is a special case of the VEC-GARCH model) The diagonal model admits a vector representation, obtained for diagonal matrices A(i) and B(j).
We will show that the class of VEC-GARCH models is stable by aggregation. Recall that the vec(·) operator converts any matrix to a vector by stacking all the columns of the matrix into one vector. It is related to the vech operator by the formulas
vec(A) = Dm vech(A),  vech(A) = D+m vec(A),   (11.10)
where A is any m × m symmetric matrix, Dm is a full-rank m² × m(m + 1)/2 matrix (the so-called 'duplication matrix'), whose entries are only 0 and 1, and D+m = (D′mDm)^{−1}D′m. We also have the relation
vec(ABC) = (C′ ⊗ A) vec(B),   (11.11)
where ⊗ denotes the Kronecker matrix product, provided the product ABC is well defined.
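These relations are easy to verify numerically. In the sketch below, duplication_matrix and vech are ad hoc helpers (not a library API) built directly from the definitions, and (11.10)–(11.11) are checked on random matrices:

```python
import numpy as np

def duplication_matrix(m):
    """D_m such that vec(A) = D_m vech(A) for any symmetric m x m matrix A."""
    D = np.zeros((m * m, m * (m + 1) // 2))
    col = 0
    for j in range(m):
        for i in range(j, m):
            D[j * m + i, col] = D[i * m + j, col] = 1.0  # entries A[i,j] and A[j,i]
            col += 1
    return D

def vech(A):
    return np.concatenate([A[j:, j] for j in range(A.shape[0])])

rng = np.random.default_rng(0)
m = 3
S = rng.standard_normal((m, m)); S = S + S.T           # a symmetric matrix
D = duplication_matrix(m)
Dplus = np.linalg.solve(D.T @ D, D.T)                  # D+ = (D'D)^{-1} D'
assert np.allclose(D @ vech(S), S.flatten(order="F"))  # vec(A) = D_m vech(A)
assert np.allclose(Dplus @ S.flatten(order="F"), vech(S))
A, B, C = (rng.standard_normal((m, m)) for _ in range(3))
assert np.allclose((A @ B @ C).flatten(order="F"),
                   np.kron(C.T, A) @ B.flatten(order="F"))  # relation (11.11)
```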
Theorem 11.2 (The VEC-GARCH is stable by aggregation) Let (εt) be a VEC-GARCH(p, q) process. Then, for any invertible m × m matrix P, the process ε̃t = Pεt is a VEC-GARCH(p, q) process.
Proof. Setting H̃t = PHtP′, we have Var(ε̃t | ε̃u, u < t) = H̃t and, by (11.10) and (11.11),
vech(H̃t) = D+m vec(PHtP′) = D+m (P ⊗ P) vec(Ht) = D+m (P ⊗ P) Dm vech(Ht),
so that
vech(H̃t) = ω̃ + Σ_{i=1}^q Ã(i) vech(ε̃t−i ε̃′t−i) + Σ_{j=1}^p B̃(j) vech(H̃t−j),
where
ω̃ = D+m (P ⊗ P) Dm ω,  Ã(i) = D+m (P ⊗ P) Dm A(i) D+m (P^{−1} ⊗ P^{−1}) Dm,  B̃(j) = D+m (P ⊗ P) Dm B(j) D+m (P^{−1} ⊗ P^{−1}) Dm.
To derive the form of Ã(i) we use
vech(εt−i ε′t−i) = D+m (P^{−1} ⊗ P^{−1}) Dm vech(ε̃t−i ε̃′t−i),
and for B̃(j) we use
vech(Ht−j) = D+m (P^{−1} ⊗ P^{−1}) Dm vech(H̃t−j). □
Positivity Conditions
We now seek conditions ensuring the positivity of Ht. A generic element of the vector
ht = ω + Σ_{i=1}^q A(i) vech(εt−i ε′t−i) + Σ_{j=1}^p B(j) ht−j
is denoted by hkℓ,t (k ≥ ℓ), and we denote by a(i)kℓ,k′ℓ′ (b(j)kℓ,k′ℓ′) the entry of A(i) (B(j)) located on the same row as hkℓ,t and belonging to the same column as the element hk′ℓ′,t of ht. We thus have an expression of the form
hkℓ,t = ωkℓ + Σ_{i=1}^q Σ_{k′≥ℓ′} a(i)kℓ,k′ℓ′ εk′,t−i εℓ′,t−i + Σ_{j=1}^p Σ_{k′≥ℓ′} b(j)kℓ,k′ℓ′ hk′ℓ′,t−j.
Denoting by A(i)kℓ the m × m symmetric matrix with (k′, ℓ′)th entry a(i)kℓ,k′ℓ′/2, for k′ ≠ ℓ′, and the elements a(i)kℓ,k′k′ on the diagonal, the preceding equality is written as
hkℓ,t = ωkℓ + Σ_{i=1}^q ε′t−i A(i)kℓ εt−i + Σ_{j=1}^p Σ_{k′≥ℓ′} b(j)kℓ,k′ℓ′ hk′ℓ′,t−j.   (11.12)
In order to obtain a more compact form for the last part of this expression, let us introduce the spectral decomposition of the symmetric matrices Ht, assumed to be positive semi-definite. We have
Ht = Σ_{s=1}^m λst ust u′st,
where (u1t, …, umt) is an orthogonal matrix of eigenvectors ust associated with the (positive) eigenvalues λst of Ht. Defining the matrices B(j)kℓ by analogy with the A(i)kℓ, we get
hkℓ,t = ωkℓ + Σ_{i=1}^q ε′t−i A(i)kℓ εt−i + Σ_{j=1}^p Σ_{s=1}^m λs,t−j u′s,t−j B(j)kℓ us,t−j.   (11.13)
Finally, consider the m² × m² matrices admitting the block forms 𝔸(i) = (A(i)kℓ)1≤k,ℓ≤m and 𝔹(j) = (B(j)kℓ)1≤k,ℓ≤m. The preceding expressions are equivalent to
Ht = Ω + Σ_{i=1}^q (Im ⊗ εt−i)′ 𝔸(i) (Im ⊗ εt−i) + Σ_{j=1}^p Σ_{s=1}^m λs,t−j (Im ⊗ us,t−j)′ 𝔹(j) (Im ⊗ us,t−j),   (11.14)
where Ω is the symmetric matrix such that vech(Ω) = ω.
In this form, it is evident that the assumption
Ω is positive definite and the matrices 𝔸(i) and 𝔹(j) are positive semi-definite   (11.15)
ensures that if the Ht−j are almost surely positive definite, then so is Ht.
Example 11.1 (Three representations of a vector ARCH(1) model) For p = 0, q = 1 and m = 2, the conditional variance is written, in the form (11.9), as
(h11,t, h21,t, h22,t)′ = ω + A(1) (ε²1,t−1, ε1,t−1ε2,t−1, ε²2,t−1)′,
in the form (11.12), writing A(1) = (aij)1≤i,j≤3, as
hkℓ,t = ωkℓ + ε′t−1 A(1)kℓ εt−1, with A(1)11 = ( a11 a12/2 ; a12/2 a13 ), A(1)21 = ( a21 a22/2 ; a22/2 a23 ), A(1)22 = ( a31 a32/2 ; a32/2 a33 ),
and in the form (11.14) as
Ht = Ω + (I2 ⊗ εt−1)′ 𝔸(1) (I2 ⊗ εt−1), with 𝔸(1) = ( A(1)11 A(1)12 ; A(1)21 A(1)22 ), A(1)12 = A(1)21.
This example shows that, even for small orders, the VEC model potentially has an enormous number of parameters, which can make estimation of the parameters computationally demanding. Moreover, the positivity conditions are not directly obtained from (11.9) but from (11.14), involving the spectral decomposition of the matrices Ht−j.
The following classes provide more parsimonious and tractable models.
11.2.3 Constant Conditional Correlations Models
Suppose that, for a multivariate GARCH process of the form (11.6), all the past information on εkt, involving all the variables εℓ,t−i, is summarized in the variable hkk,t, with hkk,t = Var(εkt | εu, u < t). Then, letting η̃kt = εkt/hkk,t^{1/2}, we define for all k a sequence of iid variables with zero mean and unit variance. The variables η̃kt of a given date are generally correlated, so let R = Var(η̃t), where η̃t = (η̃1t, …, η̃mt)′. The conditional variance of
εt = Dt η̃t,  Dt = diag(h11,t^{1/2}, …, hmm,t^{1/2}),
is then written as
Ht = Dt R Dt.   (11.16)
By construction, the conditional correlations between the components of εt are time-invariant:
Corr(εkt, εℓt | εu, u < t) = hkℓ,t/(hkk,t^{1/2} hℓℓ,t^{1/2}) = ρkℓ,
where R = (ρkℓ).
To complete the specification, the dynamics of the conditional variances hkk,t has to be defined. The simplest constant conditional correlations (CCC) model relies on the following univariate GARCH specifications:
hkk,t = ωk + Σ_{i=1}^q ak,i ε²k,t−i + Σ_{j=1}^p bk,j hkk,t−j,  k = 1, …, m,   (11.17)
where ωk > 0, ak,i ≥ 0, bk,j ≥ 0, −1 ≤ ρkℓ ≤ 1, ρkk = 1, and R is symmetric and positive semi-definite. Observe that the conditional variances are specified as in the diagonal model. The conditional covariances clearly are not linear in the squares and cross products of the returns.
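A minimal simulation sketch of a bivariate CCC-GARCH(1,1) of the form (11.16)–(11.17); all numerical values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 1000, 2
omega = np.array([0.10, 0.05])
a = np.array([0.10, 0.08])              # ARCH coefficients a_{k,1}
b = np.array([0.85, 0.88])              # GARCH coefficients b_{k,1}
R = np.array([[1.0, 0.4], [0.4, 1.0]])  # constant correlation matrix
R_half = np.linalg.cholesky(R)          # R_half @ R_half.T == R

h = omega / (1 - a - b)                 # start at the unconditional variances
eps = np.zeros((n, m))
for t in range(n):
    if t > 0:
        h = omega + a * eps[t - 1] ** 2 + b * h   # univariate recursions (11.17)
    D = np.diag(np.sqrt(h))                        # D_t
    eps[t] = D @ (R_half @ rng.standard_normal(m)) # eps_t = D_t R^{1/2} eta_t
```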
In a multivariate framework, it seems natural to extend the specification (11.17) by allowing hkk,t to depend not only on its own past, but also on the past of all the variables εℓ,t. Set
ht = (h11,t, …, hmm,t)′  and  ε²t = (ε²1t, …, ε²mt)′.
Definition 11.4 (CCC-GARCH(p, q) process) Let (ηt) be a sequence of iid variables with distribution η. A process (εt) is called CCC-GARCH(p, q) if it satisfies
εt = Ht^{1/2} ηt,  Ht = Dt R Dt,  Dt = {diag(ht)}^{1/2},
ht = ω + Σ_{i=1}^q Ai ε²t−i + Σ_{j=1}^p Bj ht−j,   (11.18)
where R is a correlation matrix, ω an m × 1 vector with positive coefficients, and the Ai and Bj are m × m matrices with nonnegative coefficients.
We have εt = Dt η̃t, where η̃t = R^{1/2} ηt is a centered vector with covariance matrix R. The components of εt thus have the usual expression εkt = hkk,t^{1/2} η̃kt, but the conditional variance hkk,t depends on the past of all the components of εt.
Note that the conditional covariances are generally nonlinear functions of the components of εt−i ε′t−i and of past values of the components of Ht. Model (11.18) is thus not a VEC-GARCH model, defined by (11.9), except when R is the identity matrix.
One advantage of this specification is that a simple condition ensuring the positive definiteness of Ht is obtained through the positivity of the coefficients of ω and of the matrices Ai and Bj, together with the choice of a positive definite matrix R. We shall also see that the study of the stationarity is remarkably simple.
Two limitations of the CCC model are, however, (i) its nonstability by aggregation and (ii) the arbitrary nature of the assumption of constant conditional correlations.
11.2.4 Dynamic Conditional Correlations Models
Dynamic conditional correlations GARCH (DCC-GARCH) models are an extension of CCC-GARCH, obtained by making the conditional correlation time-varying. Hence, the constant matrix R in Definition 11.4 is replaced by a matrix Rt which is measurable with respect to the past variables {εu, u < t}. For reasons of parsimony, it seems reasonable to choose diagonal matrices Ai and Bj in (11.18), corresponding to univariate GARCH models for each component as in (11.17). Different DCC models are obtained depending on the specification of Rt. A simple formulation is
Rt = θ1 R + θ2 Ψt−1 + θ3 Rt−1,   (11.19)
where the θi are positive weights summing to 1, R is a constant correlation matrix, and Ψt−1 is the empirical correlation matrix of εt−1, …, εt−M. The matrix Rt is thus a correlation matrix (see Exercise 11.9). Equation (11.19) is reminiscent of the GARCH(1, 1) specification, θ1R playing the role of the parameter ω, θ2 that of α, and θ3 that of β.
Another way of specifying the dynamics of Rt is by setting
Rt = (diag Qt)^{−1/2} Qt (diag Qt)^{−1/2},
where diag Qt is the diagonal matrix constructed with the diagonal elements of Qt, and (Qt) is a sequence of covariance matrices which is measurable with respect to σ(εu, u < t). A natural parameterization is
Qt = θ1 Q + θ2 εt−1 ε′t−1 + θ3 Qt−1,   (11.20)
where Q is a covariance matrix. Again, the formulation recalls the GARCH(1, 1) model. Though different, both specifications (11.19) and (11.20) allow us to test the assumption of a constant conditional covariance matrix, by considering the restriction θ2 = θ3 = 0. Note that the same θ2 and θ3 coefficients appear in the different conditional correlations, which thus have very similar dynamics. The matrices R and Q are often estimated/replaced by the empirical correlation and covariance matrices. In this approach a DCC model of the form (11.19) or (11.20) thus introduces only two more parameters than the CCC formulation.
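The correlation dynamics (11.20) can be sketched in a few lines. Here dcc_correlation is an ad hoc helper; which series drives the recursion (raw or standardized returns) varies across DCC variants, so it is left to the caller:

```python
import numpy as np

def dcc_correlation(Q_bar, series, theta2, theta3):
    """R_t path from Q_t = theta1*Q_bar + theta2*u_{t-1}u'_{t-1} + theta3*Q_{t-1},
    R_t = (diag Q_t)^{-1/2} Q_t (diag Q_t)^{-1/2}, with theta1 = 1 - theta2 - theta3."""
    theta1 = 1.0 - theta2 - theta3
    Q = Q_bar.copy()
    R_path = []
    for u in series:                      # u plays the role of the lagged return
        Q = theta1 * Q_bar + theta2 * np.outer(u, u) + theta3 * Q
        d = 1.0 / np.sqrt(np.diag(Q))
        R_path.append(Q * np.outer(d, d)) # rescale Q_t to a correlation matrix
    return R_path
```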
11.2.5 BEKK-GARCH Model
The BEKK acronym refers to a specific parameterization of the multivariate GARCH model developed by Baba, Engle, Kraft and Kroner, in a preliminary version of Engle and Kroner (1995).
Definition 11.5 (BEKK-GARCH(p, q) process) Let (ηt) denote an iid sequence with common distribution η. The process (εt) is called a strong GARCH(p, q), with respect to the sequence (ηt), if it satisfies
εt = Ht^{1/2} ηt,
Ht = Ω + Σ_{k=1}^K Σ_{i=1}^q Aik εt−i ε′t−i A′ik + Σ_{k=1}^K Σ_{j=1}^p Bjk Ht−j B′jk,   (11.21)
where K is an integer, Ω, Aik and Bjk are square m × m matrices, and Ω is positive definite.
The specification obviously ensures that if the matrices Ht−i, i = 1, …, p, are almost surely positive definite, then so is Ht.
To compare this model with the representation (11.9), let us derive the vector form of the equation for Ht. Using the relations (11.10) and (11.11), we get
vech(Ht) = vech(Ω) + Σ_{k=1}^K Σ_{i=1}^q D+m (Aik ⊗ Aik) Dm vech(εt−i ε′t−i) + Σ_{k=1}^K Σ_{j=1}^p D+m (Bjk ⊗ Bjk) Dm vech(Ht−j).
The model can thus be written in the form (11.9), with
A(i) = Σ_{k=1}^K D+m (Aik ⊗ Aik) Dm,  B(j) = Σ_{k=1}^K D+m (Bjk ⊗ Bjk) Dm,
for i = 1, …, q and j = 1, …, p. In particular, it can be seen that the number of coefficients of a matrix A(i) in (11.9) is [m(m + 1)/2]², whereas it is Km² in this particular case.
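This mapping from BEKK to VEC coefficients is straightforward to compute. In the sketch below, duplication_matrix and bekk_to_vec are ad hoc helpers and the BEKK matrix is illustrative:

```python
import numpy as np

def duplication_matrix(m):
    D = np.zeros((m * m, m * (m + 1) // 2))
    col = 0
    for j in range(m):
        for i in range(j, m):
            D[j * m + i, col] = D[i * m + j, col] = 1.0
            col += 1
    return D

def bekk_to_vec(A_mats, m):
    """A(i) = sum_k D_m^+ (A_ik ⊗ A_ik) D_m implied by BEKK matrices A_ik."""
    D = duplication_matrix(m)
    Dplus = np.linalg.solve(D.T @ D, D.T)
    return sum(Dplus @ np.kron(A, A) @ D for A in A_mats)

m = 2
A11 = np.array([[0.30, 0.05], [0.02, 0.25]])   # illustrative BEKK matrix (K = 1)
print(bekk_to_vec([A11], m))   # 3x3 VEC matrix: 9 entries generated by 4 parameters
```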
The BEKK class contains (Exercise 11.13) the diagonal models obtained by choosing diagonal matrices Aik and Bjk. The following theorem establishes a converse to this property.
Theorem 11.3 For the model defined by the diagonal vector representation (11.9) with
A(i) = diag{vech(Ã(i))},  B(j) = diag{vech(B̃(j))},
where the Ã(i) and B̃(j) are m × m symmetric positive semi-definite matrices, there exist matrices Aik and Bjk such that (11.21) holds, for K = m.
Proof. There exists an upper triangular matrix
U(i) = (u(i)kℓ),  u(i)kℓ = 0 for k > ℓ,
such that Ã(i) = U(i)′U(i). Let Aik = diag{u(i)k1, …, u(i)km}, the diagonal matrix built from the kth row of U(i), which has at most r = m − k + 1 nonzero entries, for k = 1, …, m. Indeed, the (a, b)th entry of Σ_{k=1}^m Aik εt−i ε′t−i A′ik is Σ_k u(i)ka u(i)kb εa,t−i εb,t−i = ã(i)ab εa,t−i εb,t−i, so it is easy to show that the first equality in (11.21) is satisfied with K = m. The second equality is obtained similarly.
Example 11.2 By way of illustration, consider the particular case where m = 2, q = K = 1 and p = 0. If A = (aij) is a 2 × 2 matrix, it is easy to see that
D+2 (A ⊗ A) D2 = ( a11²  2a11a12  a12² ; a11a21  a11a22 + a12a21  a12a22 ; a21²  2a21a22  a22² ).
Hence, canceling out the unnecessary indices,
h11,t = ω11 + a11² ε²1,t−1 + 2a11a12 ε1,t−1ε2,t−1 + a12² ε²2,t−1,
h21,t = ω21 + a11a21 ε²1,t−1 + (a11a22 + a12a21) ε1,t−1ε2,t−1 + a12a22 ε²2,t−1,
h22,t = ω22 + a21² ε²1,t−1 + 2a21a22 ε1,t−1ε2,t−1 + a22² ε²2,t−1.
In particular, the diagonal models belonging to this class are of the form
h11,t = ω11 + a11² ε²1,t−1,  h21,t = ω21 + a11a22 ε1,t−1ε2,t−1,  h22,t = ω22 + a22² ε²2,t−1
(obtained by taking a12 = a21 = 0).
Remark 11.2 (Interpretation of the BEKK coefficients) Example 11.2 shows that the BEKK specification imposes highly artificial constraints on the volatilities and covolatilities of the components. As a consequence, the coefficients of a BEKK representation are difficult to interpret.
Remark 11.3 (Identifiability) Identifiability of a BEKK representation requires additional constraints. Indeed, the same representation holds if Aik is replaced by −Aik, or if the matrices A1k,…, Aqk and A1k′,…, Aqk′ are permuted for k ≠ k′.
Example 11.3 (A general and identifiable BEKK representation) Consider the case m = 2, q = 1 and p = 0. Suppose that the distribution η is nondegenerate, so that there exists no nontrivial constant linear combination of a finite number of the εk,t−i εℓ,t−i. Let
Ht = Ω + Σ_{k=1}^4 Ak εt−1 ε′t−1 A′k,
where Ω is a symmetric positive definite matrix and
A1 = ( a11,1  a12,1 ; a21,1  a22,1 ),  A2 = ( 0  0 ; a21,2  a22,2 ),  A3 = ( 0  a12,3 ; 0  a22,3 ),  A4 = ( 0  0 ; 0  a22,4 ),
with a11,1 ≥ 0, a12,3 ≥ 0, a21,2 ≥ 0 and a22,4 ≥ 0.
Let us show that this BEKK representation is both identifiable and quite general. Easy, but tedious, computation shows that an expression of the form (11.9) holds with
A(1) = ( a11,1²  2a11,1a12,1  a12,1² + a12,3² ; a11,1a21,1  a11,1a22,1 + a12,1a21,1  a12,1a22,1 + a12,3a22,3 ; a21,1² + a21,2²  2(a21,1a22,1 + a21,2a22,2)  a22,1² + a22,2² + a22,3² + a22,4² ).
In view of the sign constraint, the (1, 1)th element of A(1) allows us to identify a11,1. The (1, 2)th and (2, 1)th elements then allow us to find a12,1 and a21,1, whence the (2, 2)th element yields a22,1. The two elements of A3 are deduced from the (1, 3)th and (2, 3)th elements of A(1) and from the constraint a12,3 ≥ 0 (which could be replaced by a constraint on the sign of a22,3). A2 is identified similarly, and the nonzero element of A4 is finally identified by considering the (3, 3)th element of A(1).
In this example, the BEKK representation contains the same number of parameters as the corresponding VEC representation, but has the advantage of automatically providing a positive definite solution Ht.
It is interesting to consider the stability by aggregation of the BEKK class.
Theorem 11.4 (Stability of the BEKK model by aggregation) Let (εt) be a BEKK-GARCH(p, q) process. Then, for any invertible m × m matrix P, the process ε̃t = Pεt is a BEKK-GARCH(p, q) process.
Proof. Letting H̃t = PHtP′, Ω̃ = PΩP′, Ãik = PAikP^{−1} and B̃jk = PBjkP^{−1}, we get
H̃t = Ω̃ + Σ_{k=1}^K Σ_{i=1}^q Ãik ε̃t−i ε̃′t−i Ã′ik + Σ_{k=1}^K Σ_{j=1}^p B̃jk H̃t−j B̃′jk
and, Ω̃ being a positive definite matrix, the result is proved. □
As in the univariate case, the 'square' of the process (εt) is the solution of an ARMA model. Indeed, define the innovation of the process vec(εtε′t):
vt = vec(εtε′t) − vec(Ht).   (11.22)
Applying the vec operator to (11.21), and substituting the variables vec(Ht−j) by vec(εt−jε′t−j) − vt−j, we get the representation
vec(εtε′t) = vec(Ω) + Σ_{i=1}^r Σ_{k=1}^K (Aik ⊗ Aik + Bik ⊗ Bik) vec(εt−iε′t−i) + vt − Σ_{j=1}^p Σ_{k=1}^K (Bjk ⊗ Bjk) vt−j,   (11.23)
where r = max(p, q), with the convention Aik = 0 (Bjk = 0) if i > q (j > p). This representation cannot be used to obtain stationarity conditions because the process (vt) is not iid in general. However, it can be used to derive the second-order moment, when it exists, of the process (εt):
vec(Eεtε′t) = vec(Ω) + Σ_{i=1}^r Σ_{k=1}^K (Aik ⊗ Aik + Bik ⊗ Bik) vec(Eεtε′t),
that is,
vec(Eεtε′t) = {Im² − Σ_{i=1}^r Σ_{k=1}^K (Aik ⊗ Aik + Bik ⊗ Bik)}^{−1} vec(Ω),
provided that the matrix in braces is nonsingular.
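Assuming it exists, this moment is straightforward to evaluate numerically. In the sketch below the BEKK(1,1) matrices are illustrative and chosen so that the matrix in braces is nonsingular:

```python
import numpy as np

m = 2
Omega = np.array([[0.20, 0.05], [0.05, 0.30]])
A = np.array([[0.25, 0.00], [0.00, 0.20]])
B = np.array([[0.90, 0.00], [0.00, 0.85]])
M = np.kron(A, A) + np.kron(B, B)
# vec(E eps eps') = {I - sum (A⊗A + B⊗B)}^{-1} vec(Omega)
S_vec = np.linalg.solve(np.eye(m * m) - M, Omega.flatten(order="F"))
S = S_vec.reshape((m, m), order="F")        # unconditional E eps_t eps_t'
print(S)
```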
11.2.6 Factor GARCH Models
In these models, it is assumed that a small number of linear combinations ft of the m components of εt, or exogenous variables summarizing the comovements of the components, have a GARCH structure.
Factor models with idiosyncratic noise
A very popular factor model links the individual returns εit to the market return ft through a regression model
εit = βi ft + ηit,  i = 1, …, m.
The parameter βi can be interpreted as a sensitivity to the factor, and the noise ηit as a specific risk (often called idiosyncratic risk) which is conditionally uncorrelated with ft. It follows that Ht = Ω + λt ββ′, where β is the vector of sensitivities, λt is the conditional variance of ft and Ω is the covariance matrix of the idiosyncratic terms. More generally, assuming the existence of r conditionally uncorrelated factors, we obtain the decomposition
Ht = Ω + Σ_{j=1}^r λjt βj β′j.   (11.25)
It is not restrictive to assume that the factors are linear combinations of the components of εt (Exercise 11.10). If, in addition, the conditional variances λjt are specified as univariate GARCH, the model remains parsimonious in terms of unknown parameters, and (11.25) reduces to a particular BEKK model (Exercise 11.11). If Ω is chosen to be positive definite and if the univariate series (λjt)t, j = 1, …, r, are independent, strictly and second-order stationary, then it is clear that (11.25) defines a sequence of positive definite matrices (Ht) that are strictly and second-order stationary.
Principal components GARCH model
The concept of factor is central to principal components analysis (PCA) and to other methods of exploratory data analysis. PCA relies on decomposing the covariance matrix V of m quantitative variables as V = PΛP′, where Λ is a diagonal matrix whose elements are the eigenvalues λ1 ≥ λ2 ≥ ⋯ ≥ λm of V, and where P is the orthogonal matrix of the corresponding eigenvectors. The first principal component is the linear combination of the m variables, with weights given by the first column of P, which, in some sense, best summarizes the set of m variables (Exercise 11.12). There exist m principal components, which are uncorrelated and whose variances λ1, …, λm (and hence whose explanatory powers) are in decreasing order. It is natural to consider this method for extracting the key factors of the volatilities of the m components of εt.
We obtain a principal component GARCH (PC-GARCH) or orthogonal GARCH (O-GARCH) model by assuming that
Ht = P Λt P′,   (11.26)
where P is an orthogonal matrix (P′ = P^{−1}) and Λt = diag(λ1t, …, λmt), the λit being the volatilities of the factors, which can be obtained from univariate GARCH-type models. This is equivalent to assuming that
ft = P′εt satisfies Var(ft | εu, u < t) = Λt,   (11.27)
where ft is the principal component vector, whose components are conditionally orthogonal factors. If univariate GARCH(1, 1) models are used for the factors, then
λjt = ωj + αj f²j,t−1 + βj λj,t−1,  j = 1, …, m.   (11.28)
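A minimal O-GARCH sketch along these lines: P is estimated by PCA of the sample covariance matrix, and each factor follows a recursion of the form (11.28). The GARCH parameters are taken as given (not estimated) here; omega, alpha and beta are length-m arrays:

```python
import numpy as np

def ogarch_path(returns, omega, alpha, beta):
    """H_t = P Lambda_t P' with Lambda_t driven by GARCH(1,1) recursions (11.28).
    `returns` is an (n, m) array; P is estimated by PCA of the sample covariance."""
    n, m = returns.shape
    H_hat = np.cov(returns, rowvar=False)
    _, P = np.linalg.eigh(H_hat)          # orthogonal matrix of eigenvectors
    f = returns @ P                        # factors f_t = P' eps_t (row-wise)
    lam = f.var(axis=0)                    # initialize at the factor variances
    H_path = []
    for t in range(n):
        if t > 0:
            lam = omega + alpha * f[t - 1] ** 2 + beta * lam
        H_path.append(P @ np.diag(lam) @ P.T)   # H_t = P Lambda_t P'
    return H_path
```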
Remark 11.4 (Interpretation, factor estimation and extensions)
1. Model (11.26) can also be interpreted as a full-factor GARCH (FF-GARCH) model, that is, a model with as many factors as components and no idiosyncratic term. Let P(·, j) be the jth column of P (an eigenvector of Ht associated with the eigenvalue λjt). We get a spectral expression for the conditional variance,
Ht = Σ_{j=1}^m λjt P(·, j) P(·, j)′,
which is of the form (11.25) with an idiosyncratic variance Ω = 0.
2. A PCA of the conditional variance Ht should, in full generality, give Ht = Pt Λt P′t, with factors (that is, principal components) ft = P′t εt. Model (11.26) thus assumes that all factors are linear combinations, with fixed coefficients, of the same returns εit. For instance, the first factor f1t is the conditionally most risky factor (with the largest conditional variance λ1t; see Exercise 11.12). But since it is assumed that the direction of f1t is fixed, in the subspace of ℝm generated by the components of εit, the first factor is also the most risky unconditionally. This can be seen through the PCA of the unconditional variance H = EHt = PΛP′, with Λ = EΛt, which is assumed to exist.
3. It is easy to estimate P by applying PCA to the empirical covariance matrix Ĥ of the observations ε1, …, εn, yielding Ĥ = P̂Λ̂P̂′. The components of the estimated factors f̂t = P̂′εt are then specified as GARCH-type univariate models. Estimation of the conditional variance Ht thus reduces to estimating m univariate models.
4. It is common practice to apply PCA to centered and standardized data, in order to remove the influence of the units of the various variables. For returns εit, standardization does not seem appropriate if one wishes to retain a size effect, that is, if one expects an asset with a relatively large variance to have more weight in the riskier factors.
5. In the spirit of standard PCA, it is possible to consider only the first r principal components, which are the key factors of the system. The variance Ht is thus approximated by
Ĥt = Σ_{j=1}^r λ̂jt P̂(·, j) P̂(·, j)′,   (11.29)
where the λ̂jt are estimated from simple univariate models, such as GARCH(1, 1) models of the form (11.28), the matrix P̂ is obtained from PCA of the empirical covariance matrix Ĥ, and the factors are approximated by f̂jt = P̂(·, j)′εt. Instead of the approximation (11.29), one can use
Ĥt = Σ_{j=1}^r λ̂jt P̂(·, j) P̂(·, j)′ + Σ_{j=r+1}^m λ̂j P̂(·, j) P̂(·, j)′,   (11.30)
where the λ̂j, j = r + 1, …, m, are the smallest empirical eigenvalues.
The approximation in (11.30) is as simple as (11.29) and does not require additional computations (in particular, the r GARCH equations are retained) but has the advantage of providing an almost surely invertible estimation of Ht (for fixed n), which is required in the computation of certain statistics (such as the AIC-type information criteria based on the Gaussian log-likelihood).
6. Note that the assumption that P is orthogonal can be restrictive. The class of generalized orthogonal GARCH (GO-GARCH) processes assumes only that P is any nonsingular matrix.
11.3 Stationarity
In this section, we first discuss the difficulty of establishing stationarity conditions, or the existence of moments, for multivariate GARCH models. For the general vector model (11.9), and in particular for the BEKK model, there exist sufficient stationarity conditions. The stationary solution being nonexplicit, we propose an algorithm that converges, under certain assumptions, to the stationary solution. We will then see that the problem is much simpler for the CCC model (11.18).
11.3.1 Stationarity of VEC and BEKK Models
It is not possible to provide stationary solutions, in explicit form, for the general VEC model (11.9). To illustrate the difficulty, recall that a univariate ARCH(1) model admits a solution εt = σtηt with σt explicitly given as a function of {ηt−u, u > 0} as the square root of
σ²t = ω(1 + Σ_{k=1}^∞ α^k η²t−1 ⋯ η²t−k),
provided that the series converges almost surely. Now consider a bivariate model of the form (11.6) with Ht = I2 + α εt−1 ε′t−1, where α is assumed, for the sake of simplicity, to be scalar and positive. Also choose Ht^{1/2} to be lower triangular so as to have (11.7). Then
h11,t = 1 + α ε²1,t−1 = 1 + α h11,t−1 η²1,t−1,
h12,t = α ε1,t−1 ε2,t−1,
h22,t = 1 + α ε²2,t−1.
It can be seen that, given ηt−1, the relationship between h11,t and h11,t−1 is linear, and can be iterated to yield
h11,t = 1 + Σ_{k=1}^∞ α^k η²1,t−1 ⋯ η²1,t−k,
the series converging almost surely under the constraint E log(α η²1,t) < 0. In contrast, the relationships between h12,t, or h22,t, and the components of Ht−1 are not linear, which makes it impossible to express h12,t and h22,t as a simple function of α, {ηt−1, ηt−2, …, ηt−k} and Ht−k for k ≥ 1. This constitutes a major obstacle for determining sufficient stationarity conditions.
Remark 11.5 (Stationarity does not follow from the ARMA model) Similar to (11.22), letting vt = vech(εtε′t) − vech(Ht), we obtain the ARMA representation
vech(εtε′t) = ω + Σ_{i=1}^r C(i) vech(εt−iε′t−i) + vt − Σ_{j=1}^p B(j) vt−j,
by setting C(i) = A(i) + B(i) and by using the usual notation and conventions. In the literature, one may encounter the argument that the model is weakly stationary if the polynomial z ↦ det(Is − Σ_{i=1}^r C(i) z^i) has all its roots outside the unit circle (s = m(m + 1)/2). Although the result is certainly true with additional assumptions on the noise density (see Theorem 11.5 and the subsequent discussion), the argument is not correct since
vech(εtε′t) = {Is − Σ_{i=1}^r C(i) B^i}^{−1} {ω + vt − Σ_{j=1}^p B(j) vt−j}
constitutes a solution only if vt = vech(εtε′t) − vech(Ht) can be expressed as a function of {ηt−u, u > 0}.
Boussama (2006) obtained the following stationarity condition. Recall that ρ(A) denotes the spectral radius of a square matrix A.
Theorem 11.5 (Stationarity and ergodicity) There exists a strictly stationary and nonanticipative solution of the vector GARCH model (11.9) if:
(i) the positivity condition (11.15) is satisfied;
(ii) the distribution of η has a density, positive on a neighborhood of 0, with respect to the Lebesgue measure on ℝm;
(iii) ρ(Σ_{i=1}^q A(i) + Σ_{j=1}^p B(j)) < 1.
This solution is unique, β-mixing and ergodic.
In the particular case of the BEKK model (11.21), condition (iii) takes the form
ρ(Σ_{k=1}^K Σ_{i=1}^q D+m (Aik ⊗ Aik) Dm + Σ_{k=1}^K Σ_{j=1}^p D+m (Bjk ⊗ Bjk) Dm) < 1.
The proof of Theorem 11.5 relies on sophisticated algebraic tools. Assumption (ii) is a standard technical condition for showing the β-mixing property (but is of no use for stationarity). Note that condition (iii), written as Σ_{i=1}^q αi + Σ_{j=1}^p βj < 1 in the univariate case, is generally not necessary for strict stationarity.
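For a given BEKK model, condition (iii) is easy to check numerically. A self-contained sketch (the duplication-matrix helper is ad hoc; parameter values are illustrative):

```python
import numpy as np

def duplication_matrix(m):
    D = np.zeros((m * m, m * (m + 1) // 2))
    col = 0
    for j in range(m):
        for i in range(j, m):
            D[j * m + i, col] = D[i * m + j, col] = 1.0
            col += 1
    return D

m = 2
A = np.array([[0.30, 0.00], [0.00, 0.25]])   # BEKK ARCH matrix (K = q = 1)
B = np.array([[0.90, 0.00], [0.00, 0.85]])   # BEKK GARCH matrix (p = 1)
D = duplication_matrix(m)
Dplus = np.linalg.solve(D.T @ D, D.T)
M = Dplus @ (np.kron(A, A) + np.kron(B, B)) @ D   # A(1) + B(1)
rho = max(abs(np.linalg.eigvals(M)))
print(rho, rho < 1)   # here rho ≈ 0.90, so condition (iii) holds
```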
This theorem does not provide explicit stationary solutions, that is, a relationship between εt and the ηt−i. However, it is possible to construct an algorithm which, when it converges, allows a stationary solution to the vector GARCH model (11.9) to be defined.
Construction of a stationary solution
For any t ∈ ℤ, define h(k)t = 0 and ε(k)t = 0 for k < 0 and, recursively on k ≥ 0,
h(k)t = ω + Σ_{i=1}^q A(i) vech(ε(k−1)t−i ε(k−1)′t−i) + Σ_{j=1}^p B(j) h(k−1)t−j,   (11.31)
with ε(k)t = (H(k)t)^{1/2} ηt, where H(k)t denotes the symmetric matrix such that vech(H(k)t) = h(k)t.
Observe that, for k ≥ 1,
h(k)t = fk(ηt−1, …, ηt−k),
where fk is a measurable function, so that H(k)t is a square matrix depending measurably on ηt−1, …, ηt−k. The processes (h(k)t)t and (ε(k)t)t are thus stationary, with components taking values in the Banach space L² of the (equivalence classes of) square integrable random variables. It is then clear that (11.9) admits a strictly stationary solution, which is nonanticipative and ergodic, if, for all t, the limit
ht := lim_{k→∞} h(k)t exists almost surely.   (11.32)
Indeed, letting ht = limk h(k)t and Ht = limk H(k)t, and taking the limit of each side of (11.31), we note that (11.9) is satisfied. Moreover, (εt), with εt = Ht^{1/2}ηt, constitutes a strictly stationary and nonanticipative solution, because εt is a measurable function of {ηu, u ≤ t}. In view of Theorem A.1, such a process is also ergodic. Note also that if Ht exists, it is symmetric and positive definite, because the matrices H(k)t are symmetric and satisfy, for every λ ∈ ℝm, λ ≠ 0,
λ′ H(k)t λ ≥ λ′Ωλ > 0,
where Ω is the positive definite matrix such that vech(Ω) = ω (see (11.15)).
This solution (εt) is also second-order stationary if
supk E‖h(k)t‖ < ∞.   (11.33)
Let Δ(k)t = h(k)t − h(k−1)t. From Exercise 11.8 and its proof, we obtain (11.32), and hence the existence of a strictly stationary solution to the vector GARCH equation (11.9), if there exists ρ ∈ ]0, 1[ such that
ρ^{−k} ‖Δ(k)t‖ → 0
almost surely as k → ∞, which is equivalent to
lim sup_{k→∞} ‖Δ(k)t‖^{1/k} < 1 a.s.   (11.34)
Similarly, we obtain (11.33) if lim sup_{k→∞} (E‖Δ(k)t‖)^{1/k} < 1. The criterion in (11.34) is not very explicit, but the left-hand side of the inequality can be evaluated by simulation, just as for a Lyapunov coefficient.
11.3.2 Stationarity of the CCC Model
In model (11.18), letting η̃t = R^{1/2}ηt and Υt = diag(η̃²1t, …, η̃²mt), we get
ε²t = Υt ht.
Multiplying the equation for ht in (11.18) by Υt, we thus have
ε²t = Υt ω + Σ_{i=1}^q Υt Ai ε²t−i + Σ_{j=1}^p Υt Bj ht−j,
which can be written as
zt = bt + At zt−1,   (11.35)
where
zt = (ε²′t, …, ε²′t−q+1, h′t, …, h′t−p+1)′ ∈ ℝ^{(p+q)m},  bt = ((Υtω)′, 0, …, 0, ω′, 0, …, 0)′,
and At is the (p + q)m × (p + q)m matrix whose first block row is (ΥtA1, …, ΥtAq, ΥtB1, …, ΥtBp), whose (q + 1)th block row is (A1, …, Aq, B1, …, Bp), and whose remaining nonzero blocks are identity matrices Im in positions (i + 1, i), i = 1, …, q − 1, and (q + j + 1, q + j), j = 1, …, p − 1.   (11.36)
We thus obtain a vector representation analogous to (2.16) obtained in the univariate case. This allows us to state the following result.
Theorem 11.6 (Strict stationarity of the CCC model) A necessary and sufficient condition for the existence of a strictly stationary and nonanticipative solution process for model (11.18) is γ < 0, where γ is the top Lyapunov exponent of the sequence {At, t ∈ ℤ} defined in (11.36). This stationary and nonanticipative solution, when γ < 0, is unique and ergodic.
Proof. The proof is similar to that of Theorem 2.4. The variables ηt admitting a variance, the condition E log⁺ ‖At‖ < ∞ is satisfied.
It follows that, when γ < 0, the series
zt = bt + Σ_{k=1}^∞ At At−1 ⋯ At−k+1 bt−k   (11.37)
converges almost surely for all t. A strictly stationary solution to model (11.18) is obtained as εt = {diag(ht)}^{1/2} R^{1/2} ηt, where ht denotes the (q + 1)th subvector of size m of zt. This solution is thus nonanticipative and ergodic. The proof of the uniqueness is exactly the same as in the univariate case.
The proof of the necessary part can also be easily adapted. From Lemma 2.1, it is sufficient to prove that limt→∞ A0 ⋯ A−t = 0 almost surely. It suffices to show that, for 1 ≤ i ≤ p + q,
limt→∞ A0 ⋯ A−t (ei ⊗ xi) = 0 for all xi ∈ ℝm,   (11.38)
where ei is the ith element of the canonical basis of ℝ^{p+q}, since any vector x of ℝ^{m(p+q)} can be uniquely decomposed as
x = Σ_{i=1}^{p+q} ei ⊗ xi,
where xi ∈ ℝm. As in the univariate case, the existence of a strictly stationary solution implies that A0 ⋯ A−k b−k−1 tends to 0, almost surely, as k → ∞. It follows that, using the relation b−k−1 ≥ eq+1 ⊗ ω (componentwise), we have
limk→∞ A0 ⋯ A−k (eq+1 ⊗ ω) = 0.   (11.39)
Since the components of ω are strictly positive, (11.38) thus holds for i = q + 1. Using
At(eq+j ⊗ x) = e1 ⊗ (ΥtBjx) + eq+1 ⊗ (Bjx) + eq+j+1 ⊗ x,   (11.40)
with the convention that eq+p+1 = 0, for i = q + 1 and x ≥ 0 we obtain
0 ≤ A0 ⋯ A−k (eq+2 ⊗ x) ≤ A0 ⋯ A−k A−k−1 (eq+1 ⊗ x),
where the inequalities are taken componentwise. Therefore, (11.38) holds true for i = q + 2 and, by induction, for i = q + j, j = 1, …, p, in view of (11.40). Moreover, since At(eq ⊗ x) = e1 ⊗ (ΥtAqx) + eq+1 ⊗ (Aqx), (11.38) holds for i = q. We reach the same conclusion for the other values of i using an ascending recursion, as in the univariate case. □
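In practice, the top Lyapunov exponent γ of Theorem 11.6 has no closed form, but it can be evaluated by simulation via γ = lim n^{−1} log ‖An An−1 ⋯ A1‖, renormalizing the products to avoid overflow. A sketch for a bivariate CCC-GARCH(1,1), with illustrative parameter values:

```python
import numpy as np

rng = np.random.default_rng(0)
m = 2
A1 = np.array([[0.10, 0.02], [0.03, 0.08]])
B1 = np.array([[0.80, 0.05], [0.04, 0.82]])
R_half = np.linalg.cholesky(np.array([[1.0, 0.4], [0.4, 1.0]]))

n = 50_000
log_norm, P = 0.0, np.eye(2 * m)
for _ in range(n):
    eta_tilde = R_half @ rng.standard_normal(m)
    Ups = np.diag(eta_tilde ** 2)                     # Upsilon_t
    A_t = np.block([[Ups @ A1, Ups @ B1], [A1, B1]])  # (11.36) with p = q = 1
    P = A_t @ P
    nrm = np.linalg.norm(P)
    log_norm += np.log(nrm)
    P /= nrm                                          # renormalize the product
gamma = log_norm / n
print(gamma)   # strict stationarity requires gamma < 0
```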
The following result provides a necessary strict stationarity condition which is simple to check.
Corollary 11.1 (Consequence of the strict stationarity) Let γ denote the top Lyapunov exponent of the sequence {At, t ∈ ℤ} defined in (11.36). Consider the matrix polynomial defined by ℬ(z) = Im − zB1 − ⋯ − z^pBp, z ∈ ℂ, and let 𝔅 be the corresponding mp × mp companion matrix,
𝔅 = ( B1 B2 ⋯ Bp−1 Bp ; Im 0 ⋯ 0 0 ; 0 Im ⋯ 0 0 ; ⋯ ; 0 0 ⋯ Im 0 ).
Then, if γ < 0, the following equivalent properties hold:
1. The roots of det ℬ(z) are outside the unit disk.
2. ρ(𝔅) < 1.
Proof. Because all the entries of the matrices At are nonnegative, it is clear that γ is larger than the top Lyapunov exponent of the sequence (A*t) obtained by replacing the matrices Ai by 0 in At. It is easily seen that the top Lyapunov coefficient of (A*t) coincides with that of the constant sequence equal to 𝔅, that is, with log ρ(𝔅). It follows that γ ≥ log ρ(𝔅). Hence γ < 0 entails ρ(𝔅) < 1. Finally, in view of Exercise 11.14, the equivalence between the two properties follows from
det(Imp − z𝔅) = det(Im − zB1 − ⋯ − z^pBp) = det ℬ(z). □
Corollary 11.2 Suppose that γ < 0. Let (εt) be the strictly stationary and nonanticipative solution of model (11.18). There exists s > 0 such that E‖Ht‖^s < ∞ and E‖εt‖^{2s} < ∞.
Proof. It is shown in the proof of Corollary 2.3 that the strictly stationary solution defined by (11.37) satisfies E‖zt‖^s < ∞ for some s > 0. The conclusion follows from the inequalities ‖Ht‖ ≤ K‖ht‖ ≤ K‖zt‖, for some constant K, and ‖εt‖² ≤ ‖Ht‖ ‖ηt‖², together with the independence between ηt and Ht.
11.4 Estimation of the CCC Model
We now turn to the estimation of the m-dimensional CCC-GARCH(p, q) model by the quasi-maximum likelihood method. Recall that (εt) is called a CCC-GARCH(p, q) if it satisfies
εt = Ht^{1/2} ηt,  Ht = Dt R Dt,  Dt = {diag(ht)}^{1/2},
ht = ω + Σ_{i=1}^q Ai ε²t−i + Σ_{j=1}^p Bj ht−j,   (11.41)
where R is a correlation matrix, ω is a vector of size m × 1 with strictly positive coefficients, the Ai and Bj are matrices of size m × m with nonnegative coefficients, and (ηt) is an iid sequence of centered variables in ℝm with identity covariance matrix.
As in the univariate case, the criterion is written as if the iid process (ηt) were Gaussian.
The parameters are the coefficients of the vector ω, of the matrices Ai and Bj, and the coefficients of the lower triangular part (excluding the diagonal) of the correlation matrix R = (ρij). The number of unknown parameters is thus
s0 = m + (p + q)m² + m(m − 1)/2.
The parameter vector is denoted by
θ = (ω′, α′1, …, α′q, β′1, …, β′p, ρ′)′,
where ρ′ = (ρ21, …, ρm1, ρ32, …, ρm2, …, ρm,m−1), αi = vec(Ai), i = 1, …, q, and βj = vec(Bj), j = 1, …, p. The parameter space is a subspace Θ of
]0, +∞[^m × [0, +∞[^{(p+q)m²} × ]−1, 1[^{m(m−1)/2}.
The true parameter value is denoted by
θ0 = (ω′0, α′01, …, α′0q, β′01, …, β′0p, ρ′0)′.
Before detailing the estimation procedure and its properties, we discuss the conditions that need to be imposed on the matrices Ai and Bj in order to ensure the uniqueness of the parameterization.
11.4.1 Identifiability Conditions
Let 𝒜θ(z) = Σ_{i=1}^q Ai z^i and ℬθ(z) = Im − Σ_{j=1}^p Bj z^j. By convention, 𝒜θ(z) = 0 if q = 0 and ℬθ(z) = Im if p = 0.
If ℬθ(z) is nonsingular, that is, if the roots of det(ℬθ(z)) = 0 are outside the unit disk, we deduce from (11.41) the representation
ht = ℬθ(1)^{−1} ω + ℬθ(B)^{−1} 𝒜θ(B) ε²t.   (11.42)
In the vector case, assuming that the polynomials 𝒜θ(z) and ℬθ(z) have no common root is insufficient to ensure that there exists no other pair (𝒜θ*, ℬθ*), with the same degrees (p, q), such that
ℬθ*(B)^{−1} 𝒜θ*(B) = ℬθ(B)^{−1} 𝒜θ(B).   (11.43)
This condition is equivalent to the existence of an operator U(B) such that
𝒜θ*(B) = U(B) 𝒜θ(B) and ℬθ*(B) = U(B) ℬθ(B),   (11.44)
this common factor vanishing in ℬθ(B)^{−1} 𝒜θ(B) (Exercise 11.2).
The polynomial U(B) is called unimodular if det{U(B)} is a nonzero constant. When the only common factors of the polynomials P(B) and Q(B) are unimodular, that is, when
P(B) = U(B)P1(B) and Q(B) = U(B)Q1(B) imply that U(B) is unimodular,
then P(B) and Q(B) are called left coprime.
The following example shows that, in the vector case, assuming that 𝒜θ0(B) and ℬθ0(B) are left coprime is insufficient to ensure that (11.43) has no solution θ ≠ θ0 (in the univariate case this is sufficient because the condition ℬθ(0) = ℬθ0(0) = 1 imposes U(B) = U(0) = 1).
Example 11.4 (Nonidentifiable bivariate model) For m = 2, let
![c11ue054_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue054_fmt.jpg)
with
![c11ue055_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue055_fmt.jpg)
and
![c11ue056_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue056_fmt.jpg)
The polynomial 𝒜(B) = U(B)𝒜θ0(B) has the same degree q as 𝒜θ0(B), and ℬ(B) = U(B)ℬθ0(B) is a polynomial of the same degree p as ℬθ0(B). On the other hand, U(B) has a nonzero determinant which is independent of B, hence it is unimodular. Moreover, ℬ(0) = ℬθ0(0) = Im and 𝒜(0) = 𝒜θ0(0) = 0. It is thus possible to find θ ≠ θ0 such that 𝒜(B) = 𝒜θ(B), ℬ(B) = ℬθ(B) and ω = U(1)ω0. The model is thus nonidentifiable, θ and θ0 corresponding to the same representation (11.42).
Identifiability can be ensured by several types of conditions; see Reinsel (1997, pp. 37–40), Hannan (1976) or Hannan and Deistler (1988, Section 2.7). To obtain a mild condition, define, for each column i of the matrix operators 𝒜θ(B) and ℬθ(B), the maximal degrees qi(θ) and pi(θ), respectively. Suppose that maximal values are imposed for these orders, that is,
qi(θ) ≤ qi and pi(θ) ≤ pi, i = 1, …, m,   (11.45)
where qi ≤ q and pi ≤ p are fixed integers. Denote by aqi(i) (bpi(i)) the column vector of the coefficients of B^{qi} (B^{pi}) in the ith column of 𝒜θ(B) (ℬθ(B)).
Example 11.5 (Illustration of the notation on an example) For
![c11ue057_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue057_fmt.jpg)
with a11a21a12a22b11b21b12b22 ≠ 0, we have
![c11ue058_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue058_fmt.jpg)
and
![c11ue059_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue059_fmt.jpg)
Proposition 11.1 (A simple identifiability condition) If the matrix
M(𝒜θ0, ℬθ0) = (aq1(1) ⋯ aqm(m) bp1(1) ⋯ bpm(m))   (11.46)
has full rank m, the parameters α0 and β0 are identified by the constraints (11.45) with qi = qi(θ0) and pi = pi(θ0) for any value of i.
Proof. From the proof of the theorem in Hannan (1969), U(B) satisfying (11.44) is a unimodular matrix of the form U(B) = U0 + U1B + ⋯ + UkB^k. Considering the term of highest degree (column by column), the ith column of 𝒜θ(B) = U(B)𝒜θ0(B) is a polynomial in B of degree not greater than qi if and only if Uj aqi(i) = 0, for j = 1, …, k. Similarly, we must have Uj bpi(i) = 0, for j = 1, …, k and i = 1, …, m. It follows that Uj M(𝒜θ0, ℬθ0) = 0, which implies that Uj = 0 for j = 1, …, k, thanks to condition (11.46). Consequently U(B) = U0 and, since ℬθ(0) = Im for all θ, we have U(B) = Im. □
Example 11.6 (Illustration of the identifiability condition) In Example 11.4,
![c11ue060_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue060_fmt.jpg)
is not a full-rank matrix. Hence, the identifiability condition of Proposition 11.1 is not satisfied. Indeed, the model is not identifiable.
A simpler, but more restrictive, condition is obtained by imposing the requirement that
M1(𝒜θ0, ℬθ0) = (A0q B0p)
has full rank m, where A0q and B0p are the highest-degree coefficient matrices of 𝒜θ0 and ℬθ0. This entails uniqueness under the constraint that the degrees of 𝒜θ and ℬθ are not greater than q and p, respectively.
Example 11.7 (Another illustration of the identifiability condition) Turning again to Example 11.5 with α12β21 = α22β11 and, for instance, α21 = 0 and α22 ≠ 0, observe that the matrix
![c11ue062_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue062_fmt.jpg)
does not have full rank, but the matrix
![c11ue063_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue063_fmt.jpg)
does have full rank.
More restrictive forms, such as the echelon form, are sometimes required to ensure identifiability.
11.4.2 Asymptotic Properties of the QMLE of the CCC-GARCH model
Let (ε1, …, εn) be an observation of length n of the unique nonanticipative and strictly stationary solution (εt) of model (11.41). Conditionally on nonnegative initial values ε0, …, ε1−q, h̃0, …, h̃1−p, the Gaussian quasi-likelihood is written as
Ln(θ) = Ln(θ; ε1, …, εn) = Π_{t=1}^n (2π)^{−m/2} (det H̃t)^{−1/2} exp(−½ ε′t H̃t^{−1} εt),
where the H̃t are recursively defined, for t ≥ 1, by
H̃t = D̃t R D̃t,  D̃t = {diag(h̃t)}^{1/2},
h̃t = ω + Σ_{i=1}^q Ai ε²t−i + Σ_{j=1}^p Bj h̃t−j.
A QMLE of θ is defined as any measurable solution θ̂n such that
θ̂n = arg max_{θ∈Θ} Ln(θ) = arg min_{θ∈Θ} l̃n(θ),   (11.47)
where
l̃n(θ) = n^{−1} Σ_{t=1}^n l̃t,  with l̃t = l̃t(θ) = ε′t H̃t^{−1} εt + log det H̃t.
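For concreteness, the criterion l̃n can be computed as in the following sketch (CCC-GARCH(1,1), fixed initial values); qml_criterion is a hypothetical helper, and minimizing it over θ under the positivity constraints yields the QMLE:

```python
import numpy as np

def qml_criterion(eps, omega, A1, B1, R):
    """l̃_n(θ) = n^{-1} sum_t { eps_t' H̃_t^{-1} eps_t + log det H̃_t } for a
    CCC-GARCH(1,1); a sketch with the fixed initial value h̃_0 = ω, which, as
    shown in the consistency proof, does not affect the asymptotics."""
    n, m = eps.shape
    h = omega.copy()                     # arbitrary positive initial value h̃_0
    total = 0.0
    for t in range(n):
        if t > 0:
            h = omega + A1 @ eps[t - 1] ** 2 + B1 @ h
        D = np.diag(np.sqrt(h))
        Ht = D @ R @ D                   # H̃_t = D̃_t R D̃_t
        _, logdet = np.linalg.slogdet(Ht)
        total += eps[t] @ np.linalg.solve(Ht, eps[t]) + logdet
    return total / n

# A QMLE is then obtained by minimizing over θ = (ω, vec A1, vec B1, ρ),
# e.g. with scipy.optimize.minimize under the positivity constraints.
```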
Remark 11.6 (Choice of initial values) It will be shown later that, as in the univariate case, the initial values have no influence on the asymptotic properties of the estimator. These initial values can be fixed, for instance, so that
![c11ue067_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue067_fmt.jpg)
They can also be chosen as functions of θ, such as
![c11ue068_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue068_fmt.jpg)
or as random variable functions of the observations, such as
![c11ue069_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue069_fmt.jpg)
where the first r = max{p, q} observations are denoted by ε1−r, …, ε0.
Let γ (A0) denote the top Lyapunov coefficient of the sequence of matrices A0 = (A0t) defined as in (11.36), at θ = θ0. The following assumptions will be used to establish the strong consistency of the QMLE.
A1: θ0 ∈ Θ and Θ is compact.
A2: γ(A0) < 0 and, for all θ ∈ Θ, det ℬθ(z) = 0 ⇒ |z| > 1.
A3: The components of ηt are independent and their squares have nondegenerate distributions.
A4: If p > 0, then 𝒜θ0(z) and ℬθ0(z) are left coprime and M1(𝒜θ0, ℬθ0) has full rank m.
A5: R is a positive definite correlation matrix for all θ ∈ Θ.
If the space Θ is constrained by (11.45), that is, if maximal orders are imposed for each component of 𝒜θ(z) and ℬθ(z) in each equation, then assumption A4 can be replaced by the following more general condition:
A4′: If p > 0, then 𝒜θ0(z) and ℬθ0(z) are left coprime and M(𝒜θ0, ℬθ0) has full rank m.
It will be useful to approximate the sequence (l̃t(θ)) by an ergodic and stationary sequence. Assumption A2 implies that, for all θ ∈ Θ, the roots of det ℬθ(z) are outside the unit disk. Denote by (ht)t = {ht(θ)}t the strictly stationary, nonanticipative and ergodic solution of
ht = ω + Σ_{i=1}^q Ai ε²t−i + Σ_{j=1}^p Bj ht−j.   (11.48)
Now, letting Dt = {diag(ht)}^{1/2} and Ht = DtRDt, we define
ln(θ) = n^{−1} Σ_{t=1}^n lt,  with lt = lt(θ) = ε′t Ht^{−1} εt + log det Ht.
We are now in a position to state the following consistency theorem.
Theorem 11.7 (Strong consistency) Let (θ̂n) be a sequence of QMLEs satisfying (11.47). Then, under A1–A5 (or A1–A3, A4′ and A5),
θ̂n → θ0, almost surely as n → ∞.
To establish the asymptotic normality we require the following additional assumptions:
A6: θ0 ∈ Θ̊, where Θ̊ denotes the interior of Θ.
A7: E‖ηtη′t‖² < ∞.
Theorem 11.8 (Asymptotic normality) Under the assumptions of Theorem 11.7 and A6–A7, √n(θ̂n − θ0) converges in distribution to 𝒩(0, J^{−1}IJ^{−1}), where J is a positive definite matrix and I is a positive semi-definite matrix, defined by
I = Eθ0{∂lt(θ0)/∂θ · ∂lt(θ0)/∂θ′},  J = Eθ0{∂²lt(θ0)/∂θ∂θ′}.
11.4.3 Proof of the Consistency and the Asymptotic Normality of the QML
We shall use the multiplicative norm (see Exercises 11.5 and 11.6) defined by
‖A‖ := sup_{‖x‖≤1} ‖Ax‖ = {ρ(A′A)}^{1/2},   (11.49)
where A is a d1 × d2 matrix, ‖x‖ is the Euclidean norm of the vector x ∈ ℝ^{d2}, and ρ(·) denotes the spectral radius. This norm satisfies, for any d2 × d1 matrix B,
‖AB‖ ≤ ‖A‖ ‖B‖ and, when d1 = d2, ρ(A) ≤ ‖A‖,   (11.50)
|Tr(AB)| ≤ min(d1, d2) ‖A‖ ‖B‖.   (11.51)
Proof of Theorem 11.7
The proof is similar to that of Theorem 7.1 for the univariate case.
Rewrite (11.48) in matrix form as
h̲t = ω̲t + 𝔅 h̲t−1,   (11.52)
where 𝔅 is the companion matrix defined in Corollary 11.1 and
h̲t = (h′t, h′t−1, …, h′t−p+1)′,  ω̲t = ((ω + Σ_{i=1}^q Ai ε²t−i)′, 0, …, 0)′.   (11.53)
We shall establish the intermediate results (a), (c) and (d), which are stated as in the univariate case (see Section 7.4 of Chapter 7), result (b) being replaced by
(b)′: {ht(θ) = ht(θ0) Pθ0-a.s. and R(θ) = R(θ0)} ⇒ θ = θ0.
Proof of (a): initial values are asymptotically irrelevant. In view of assumption A2 and Corollary 11.1, we have ρ(𝔅) < 1. By the compactness of Θ, we even have
sup_{θ∈Θ} ρ(𝔅) < 1.   (11.54)
Iteratively using equation (11.52), as in the univariate case, we deduce that, almost surely,
sup_{θ∈Θ} ‖ht − h̃t‖ ≤ K ρ^t, for all t,   (11.55)
where h̃t denotes the vector used in the criterion l̃n, that is, obtained from the same recursion initialized at the fixed initial values. Observe that K is a random variable that depends on the past values {εt, t ≤ 0}. Since K does not depend on n, it can be treated as a constant, such as ρ. From (11.55) we deduce that, almost surely,
sup_{θ∈Θ} ‖Ht − H̃t‖ ≤ K ρ^t.   (11.56)
Noting that ‖R^{−1}‖ is the inverse of the eigenvalue of smallest modulus of R, and that the diagonal elements of Dt are bounded below by the square roots of the components of ω, we have
sup_{θ∈Θ} ‖Ht^{−1}‖ ≤ K,   (11.57)
using A5, the compactness of Θ and the strict positivity of the components of ω. Similarly, we have
sup_{θ∈Θ} ‖H̃t^{−1}‖ ≤ K.   (11.58)
Now
l̃n(θ) − ln(θ) = n^{−1} Σ_{t=1}^n {ε′t (H̃t^{−1} − Ht^{−1}) εt + log(det H̃t/det Ht)}.   (11.59)
The first sum can be written as
n^{−1} Σ_{t=1}^n ε′t H̃t^{−1} (Ht − H̃t) Ht^{−1} εt ≤ n^{−1} Σ_{t=1}^n ‖H̃t^{−1}‖ ‖Ht − H̃t‖ ‖Ht^{−1}‖ ε′tεt ≤ K n^{−1} Σ_{t=1}^n ρ^t ε′tεt → 0
as n → ∞, using (11.51), (11.56), (11.57), (11.58), the Cesàro lemma and the fact that ρ^t ‖εtε′t‖ = ρ^t ε′tεt → 0 a.s. Now, using (11.50), the triangle inequality and, for x ≥ −1, log(1 + x) ≤ x, we have
|log(det H̃t/det Ht)| = |log det{Im + Ht^{−1}(H̃t − Ht)}| ≤ m ‖Ht^{−1}‖ ‖H̃t − Ht‖ ≤ K ρ^t.
Using (11.56), (11.57) and (11.58) again, we deduce that the second sum in (11.59) tends to 0. We have thus shown that, almost surely, as n → ∞,
sup_{θ∈Θ} |l̃n(θ) − ln(θ)| → 0.
Proof of (b)′: identifiability of the parameter. Suppose that, for some θ ≠ θ0,
ht(θ) = ht(θ0) and R(θ) = R(θ0), Pθ0-a.s.
Then it readily follows that ρ = ρ0 and, using the invertibility of the polynomial ℬθ(B) under assumption A2, by (11.42),
ℬθ(B)^{−1} 𝒜θ(B) ε²t − ℬθ0(B)^{−1} 𝒜θ0(B) ε²t = ℬθ0(1)^{−1} ω0 − ℬθ(1)^{−1} ω, Pθ0-a.s.,
that is,
𝒫(B) ε²t = c, Pθ0-a.s.,
where 𝒫(B) = ℬθ(B)^{−1}𝒜θ(B) − ℬθ0(B)^{−1}𝒜θ0(B) = Σ_{i=0}^∞ 𝒫i B^i and c is a constant vector. Noting that 𝒫0 = 𝒫(0) = 0 and isolating the terms that are functions of ηt−1, we obtain
𝒫1 ε²t−1 = c − Zt−2,
where Zt−2 belongs to the σ-field generated by {ηt−2, ηt−3, …}. Since ηt−1 is independent of this σ-field, Exercise 11.3 shows that the latter equality contradicts A3 unless, for i, j = 1, …, m, pij hjj,t = 0 almost surely, where the pij are the entries of 𝒫1. Because hjj,t > 0 for all j, we thus have 𝒫1 = 0. Similarly, we show that 𝒫(B) = 0 by successively considering the terms that are functions of ηt−2, ηt−3, …. Therefore, in view of A4 (or A4′), we have α = α0 and β = β0 (see Section 11.4.1). It readily follows that ω = ω0. Hence θ = θ0. We have thus established (b)′.
Proof of (c): the limit criterion is minimized at the true value. As in the univariate case, we first show that Eθ0 lt(θ) is well defined in ℝ ∪ {+∞} for all θ, and in ℝ for θ = θ0. Since ε′tHt^{−1}εt ≥ 0 and, in view of (11.57), det Ht = 1/det(Ht^{−1}) is uniformly bounded away from 0 on Θ, the variable lt(θ) is bounded below by a constant, so Eθ0 lt(θ) is well defined in ℝ ∪ {+∞}. At θ0, Jensen's inequality, the second inequality in (11.50) and Corollary 11.2 entail that
Eθ0 log det Ht(θ0) ≤ (m/s) log Eθ0 ‖Ht(θ0)‖^s < ∞,
and Eθ0 ε′t Ht^{−1}(θ0) εt = Eθ0 η′tηt = m, so that Eθ0 lt(θ0) ∈ ℝ.
Because Eθ0ℓt(θ0) is finite, Eθ0ℓt(θ) exists in ℝ ∪ {+∞} for all θ. It is thus not restrictive to study the minimum of Eθ0ℓt(θ) over the values of θ such that Eθ0ℓt(θ) < ∞. Denoting by λi,t the positive eigenvalues of Ht(θ)Ht(θ0)−1 (see Exercise 11.15), we have
![c11ue085_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue085_fmt.jpg)
because log x ≤ x − 1 for all x > 0. Since log x = x − 1 if and only if x = 1, the inequality is strict unless, for all i, λi,t = 1 Pθ0-a.s., that is, if Ht(θ) = Ht(θ0), Pθ0-a.s. (by Exercise 11.15). This equality is equivalent to
![c11ue086_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue086_fmt.jpg)
and thus to θ = θ0, by (b)′.
Proof of (d). The last part of the proof of the consistency uses the compactness of Θ and the ergodicity of (ℓt(θ)), as in the univariate case.
Theorem 11.7 is thus established.
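As a hedged numerical companion to Theorem 11.7, the following sketch estimates a toy bivariate diagonal CCC-GARCH(1,1) by Gaussian QML on a simulated trajectory; the parameterization, bounds, starting point and optimizer are ad hoc choices for the illustration, not prescriptions from the text. For moderate sample sizes the minimizer is already close to the true parameter.

```python
# Hedged sketch of Theorem 11.7: the Gaussian QMLE of a hypothetical
# bivariate diagonal CCC-GARCH(1,1) approaches the true parameter.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

def simulate(theta, n):
    omega, a, b, rho = theta[:2], theta[2:4], theta[4:6], theta[6]
    L = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))
    h = omega / (1.0 - a - b)                # unconditional variance start
    eps = np.empty((n, 2))
    for t in range(n):
        eps[t] = np.sqrt(h) * (L @ rng.standard_normal(2))
        h = omega + a * eps[t] ** 2 + b * h
    return eps

def qml(theta, eps):
    """Average Gaussian quasi-log-likelihood criterion (to be minimized)."""
    omega, a, b, rho = theta[:2], theta[2:4], theta[4:6], theta[6]
    R = np.array([[1.0, rho], [rho, 1.0]])
    Rinv, logdetR = np.linalg.inv(R), np.linalg.slogdet(R)[1]
    h, total = eps.var(axis=0).copy(), 0.0   # arbitrary initial values
    for t in range(len(eps)):
        z = eps[t] / np.sqrt(h)
        total += z @ Rinv @ z + np.log(h).sum() + logdetR
        h = omega + a * eps[t] ** 2 + b * h
    return total / len(eps)

theta0 = np.array([0.10, 0.05, 0.10, 0.15, 0.80, 0.75, 0.30])
eps = simulate(theta0, n=2000)
bounds = [(1e-4, 1.0)] * 2 + [(0.0, 0.5)] * 2 + [(0.0, 0.98)] * 2 + [(-0.9, 0.9)]
fit = minimize(qml, np.array([0.2, 0.2, 0.1, 0.1, 0.5, 0.5, 0.0]),
               args=(eps,), method="L-BFGS-B", bounds=bounds)
print(np.round(fit.x, 3))                    # close to theta0 for large n
```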
Proof of Theorem 11.8
We start by stating a few elementary results on the differentiation of expressions involving matrices. If f(A) is a real-valued function of a matrix A whose entries aij are functions of some variable x, the chain rule for differentiation of compositions of functions states that

∂f(A)/∂x = Tr{(∂f(A)/∂A)′ ∂A/∂x},

where ∂f(A)/∂A denotes the matrix with generic entry ∂f(A)/∂aij. Moreover, for A invertible we have

∂ log |det(A)|/∂x = Tr{A−1 ∂A/∂x} and ∂A−1/∂x = −A−1 (∂A/∂x) A−1.
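As a quick sanity check, these two rules can be verified by central finite differences on an arbitrary smooth positive definite matrix function A(x) invented for the purpose:

```python
# Finite-difference check of the two differentiation rules above, using an
# arbitrary smooth positive definite matrix function A(x) (invented here).
import numpy as np

def A(x):
    return np.array([[2.0 + x, 0.5 * x],
                     [0.5 * x, 1.0 + x ** 2]])

x0, dx = 0.3, 1e-6
dA = (A(x0 + dx) - A(x0 - dx)) / (2 * dx)      # numerical dA/dx

# d/dx log det A = Tr(A^{-1} dA/dx)
lhs = (np.linalg.slogdet(A(x0 + dx))[1] - np.linalg.slogdet(A(x0 - dx))[1]) / (2 * dx)
rhs = np.trace(np.linalg.inv(A(x0)) @ dA)
print(lhs, rhs)                                # agree to ~1e-8

# d/dx A^{-1} = -A^{-1} (dA/dx) A^{-1}
lhs2 = (np.linalg.inv(A(x0 + dx)) - np.linalg.inv(A(x0 - dx))) / (2 * dx)
rhs2 = -np.linalg.inv(A(x0)) @ dA @ np.linalg.inv(A(x0))
print(np.max(np.abs(lhs2 - rhs2)))             # ~1e-8
```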
(a) First derivative of the criterion. Applying (11.60) and (11.61), then (11.62), (11.63) and (11.64), we obtain
for i = 1,…, s1 = m + (p + q)m², and using (11.65),
for i = s1 + 1,…, s0. Letting D0t = Dt(θ0), R0 = R(θ0),
![c11ue087_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue087_fmt.jpg)
and, in this notation, the score vector is written as
for i = 1,…, s1, and
for i = s1 + 1,…, s0.
(b) Existence of moments of any order for the score. In view of (11.51) and the Cauchy-Schwarz inequality, we obtain
![c11ue088_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue088_fmt.jpg)
for i, j = 1,…, s1,
![c11ue089_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue089_fmt.jpg)
for i = 1,…, s1 and j = s1 + 1,…, s0, and
![c11ue090_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue090_fmt.jpg)
for i, j = s1 + 1,…, s0. Note also that
![c11ue091_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue091_fmt.jpg)
To show that the score admits a second-order moment, it is thus sufficient to prove that
![c11ue092_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue092_fmt.jpg)
for all i1 = 1,…, m, all i = 1,…, s1 and r0 = 2. By (11.52) and (11.54),
![c11ue093_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue093_fmt.jpg)
and, setting s2 = m + qm²,
![c11ue094_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue094_fmt.jpg)
On the other hand, we have
![c11ue095_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue095_fmt.jpg)
where A(i) = ∂A/∂θi is a matrix whose entries are all 0, apart from a 1 located at the same place as θi in A. In an abuse of notation, we denote by Ht(i1) and H0t(i1) the i1th components of Ht and Ht(θ0). With arguments similar to those used in the univariate case, that is, the inequality x/(1 + x) ≤ x^s for all x ≥ 0 and s ∈ [0, 1], and the inequalities
![c11ue096_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue096_fmt.jpg)
and, setting ,
![c11ue097_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue097_fmt.jpg)
we obtain
![c11ue098_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue098_fmt.jpg)
where the constants ρj1 (which also depend on i1, s and r0) belong to the interval [0, 1). Noting that these inequalities are uniform on a neighborhood of θ0, that they can be extended to higher-order derivatives, as in the univariate case, and that Corollary 11.2 provides the required moments, we can show a stronger result than the one stated: for all i1 = 1,…, m, all i, j, k = 1,…, s1 and all r0 ≥ 0, there exists a neighborhood V(θ0) of θ0 such that
and
(c) Asymptotic normality of the score vector. Clearly, {∂ℓt(θ0)/∂θ}t is stationary, and ∂ℓt(θ0)/∂θ is measurable with respect to the σ-field Ft generated by {ηu, u ≤ t}. From (11.69) and (11.70), we have E{∂ℓt(θ0)/∂θ | Ft−1} = 0. Property (b), and in particular (11.71), ensures the existence of the matrix
![c11ue099_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue099_fmt.jpg)
It follows that, for all λ ∈ ℝs0, the sequence (λ′∂ℓt(θ0)/∂θ, Ft)t is an ergodic, stationary and square integrable martingale difference. Corollary A.1 entails that
![c11ue100_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue100_fmt.jpg)
(d) Higher-order derivatives of the criterion. Starting from (a) and applying (11.60) and (11.65) several times, as well as (11.66), we obtain
![c11ue101_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue101_fmt.jpg)
where
![c11ue102_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue102_fmt.jpg)
and C3 is obtained by permuting εtε′t and R−1 in C1. We also obtain
![c11ue103_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue103_fmt.jpg)
![c11ue104_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue104_fmt.jpg)
and C5 is obtained by permuting εtε′t and ∂Dt/∂θi in C4. Results (11.71) and (11.72) ensure the existence of the matrix J := E∂²ℓt(θ0)/∂θ∂θ′, which is invertible, as shown in (e) below. Note that, with our parameterization, ∂²R/∂θi∂θj = 0.
Continuing the differentiations, it can be seen that ∂³ℓt(θ)/∂θi∂θj∂θk is also the trace of a sum of products of matrices similar to the Ci. The integrable matrix εtε′t appears at most once in each of these products. The other terms are, on the one hand, the bounded matrices R−1, ∂R/∂θi and ∂²R/∂θi∂θj and, on the other hand, matrices built from Dt−1 and the derivatives of Dt up to the third order. From (11.71)–(11.73), the norms of the latter matrices admit moments of any order in a neighborhood of θ0. This shows that
![c11ue105_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue105_fmt.jpg)
(e) Invertibility of the matrix J. The expression for J obtained in (d), as a function of the partial derivatives of Dt and R, is not in a convenient form for showing its invertibility. We start by writing J as a function of Ht and of its derivatives. Starting from
![c11ue106_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue106_fmt.jpg)
the differentiation formulas (11.60), (11.63) and (11.65) give
![c11ue107_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue107_fmt.jpg)
and then, using (11.64) and (11.66),
![c11ue108_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue108_fmt.jpg)
From the relation Tr(A′B) = (vec A)′ vec B, we deduce that
![c11ue109_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue109_fmt.jpg)
where, using vec(ABC) = (C′ ⊗ A) vec B,
![c11ue110_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue110_fmt.jpg)
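Both vec identities are easy to verify numerically; in this sketch, vec is implemented as a column-stacking (Fortran-order) reshape and the matrices are random:

```python
# Numerical check of Tr(A'B) = (vec A)'(vec B) and vec(ABC) = (C' ⊗ A) vec B.
import numpy as np

rng = np.random.default_rng(3)
A, B, C = (rng.standard_normal((3, 3)) for _ in range(3))
vec = lambda M: M.reshape(-1, order="F")      # column-stacking vec operator
print(np.allclose(np.trace(A.T @ B), vec(A) @ vec(B)))
print(np.allclose(vec(A @ B @ C), np.kron(C.T, A) @ vec(B)))
```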
Introducing the m² × s0 matrices
![c11ue111_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue111_fmt.jpg)
we have h = Hd. Now suppose that J = Eh′h is singular. Then there exists a nonzero vector c ∈ ℝs0 such that c′Jc = Ec′h′hc = 0. Since c′h′hc ≥ 0 almost surely, we have hc = 0 almost surely. Because H2 is a positive definite matrix with probability 1, this entails that dc = 0m² with probability 1. Decompose c into c = (c′1, c′2)′ with c1 ∈ ℝs1 and c2 ∈ ℝs3, where s3 = s0 − s1 = m(m − 1)/2. Rows 1, m + 1,…, m² of the equations
give
Differentiating equation (11.48) yields
![c11ue112_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue112_fmt.jpg)
where
![c11ue113_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue113_fmt.jpg)
Because (11.76) is satisfied for all t, we have
![c11ue114_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue114_fmt.jpg)
where quantities evaluated at θ = θ0 are indexed by 0. This entails that
![c11ue115_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue115_fmt.jpg)
and finally, introducing a vector θ1 whose first s1 components are, in vec form,
![c11ue116_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue116_fmt.jpg)
by choosing c1 small enough so that θ1 ∈ Θ. If c1 ≠ 0 then θ1 ≠ θ0. This contradicts the identifiability of the parameter; hence c1 = 0. Equations (11.75) thus become
![c11ue117_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue117_fmt.jpg)
![c11ue118_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue118_fmt.jpg)
Because the vectors ∂vec R/∂θi, i = s1 + 1,…, s0, are linearly independent, the vector c2 is null, and thus c = 0. This contradicts (11.74) and shows that the assumption that J is singular is absurd.
(f) Asymptotic irrelevance of the initial values. First remark that (11.55) and the arguments used to show (11.57) and (11.58) entail that
and thus
From (11.52), we have
![c11ue119_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue119_fmt.jpg)
where r = max{p, q} and the tilde means that initial values are taken into account. Since, for all t > r, we have
and
![c11ue120_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue120_fmt.jpg)
Thus (11.54) entails that
Because
![c11ue305001_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue305001_fmt.jpg)
we thus have (11.77), implying that
Denoting by H̃t(i1) the i1th component of H̃t(θ0),
![c11ue121_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue121_fmt.jpg)
where C0 is a strictly positive constant and, by the usual convention, the index 0 corresponds to quantities evaluated at θ = θ0. For a sufficiently small neighborhood V(θ0) of θ0, we have
![c11ue122_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue122_fmt.jpg)
for all i1, j1, j2 ∈ {1,…, m} and all δ > 0. Moreover, in H̃t(i1), the coefficient of each past squared innovation is bounded below by a constant c > 0, uniformly on θ ∈ V(θ0). We thus have
![c11ue123_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue123_fmt.jpg)
for some ρ ∈ [0, 1), all δ > 0 and all s ∈ [0, 1]. Corollary 11.2 then implies that, for all r0 ≥ 0,
![c11ue124_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue124_fmt.jpg)
From this we deduce that
The last inequality follows from (11.77) because
![c11ue125_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue125_fmt.jpg)
![c11ue126_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue126_fmt.jpg)
where
![c11ue127_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue127_fmt.jpg)
and C3 contains terms which can be handled in the same way as c1 and c2. Using (11.77)–(11.82), the Cauchy-Schwarz inequality, and
![c11ue128_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue128_fmt.jpg)
which follows from (11.71), we obtain
![c11ue129_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue129_fmt.jpg)
where ut is an integrable variable. From the Markov inequality, it follows that
![c11ue130_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue130_fmt.jpg)
We have in fact shown that this convergence is uniform on a neighborhood of θ0, but this is of no direct use for what follows. By exactly the same arguments,
![c11ue131_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue131_fmt.jpg)
where the dominating variable is an integrable random variable, which entails that
![c11ue132_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue132_fmt.jpg)
It now suffices to observe that the analogs of steps (a)–(f) in Section 7.4 have been verified, and we are done.
Multivariate ARCH models were first considered by Engle, Granger and Kraft (1984), in the guise of the diagonal model. This model was extended and studied by Bollerslev, Engle and Wooldridge (1988). The reader may refer to Hafner and Preminger (2009a), Lanne and Saikkonen (2007), van der Weide (2002) and Vrontos, Dellaportas and Politis (2003) for the definition and study of FF-GARCH models of the form (11.26) where P is not assumed to be orthonormal. The CCC-GARCH model based on (11.17) was introduced by Bollerslev (1990) and extended to (11.18) by Jeantheau (1998). A sufficient condition for strict stationarity and the existence of fourth-order moments of the CCC-GARCH(p, q) is established by Aue et al. (2009). The DCC formulations based on (11.19) and (11.20) were proposed, respectively, by Tse and Tsui (2002) and Engle (2002a). The single-factor model (11.24), which can be viewed as a dynamic version of the capital asset pricing model of Sharpe (1964), was proposed by Engle, Ng and Rothschild (1990). The main references on the O-GARCH and PC-GARCH models are Alexander (2002) and Ding and Engle (2001). See van der Weide (2002) and Boswijk and van der Weide (2006) for references on the GO-GARCH model. Hafner (2003) and He and Teräsvirta (2004) studied the fourth-order moments of multivariate GARCH models. These references, and those given in the text, can be complemented by the recent surveys by Bauwens, Laurent and Rombouts (2006) and Silvennoinen and Teräsvirta (2008), and by the book by Engle (2009).
Jeantheau (1998) gave general conditions for the strong consistency of the QMLE for multivariate GARCH models. Comte and Lieberman (2003) showed the consistency and asymptotic normality of the QMLE for the BEKK formulation. Asymptotic results were established by Ling and McAleer (2003a) for the CCC formulation of an ARMA-GARCH, and by Hafner and Preminger (2009a) for a factor GARCH model of the FF-GARCH form. Theorems 11.7 and 11.8 are concerned with the CCC formulation, and allow us to study a subclass of the models considered by Ling and McAleer (2003a), but do not cover the models studied by Comte and Lieberman (2003) or those studied by Hafner and Preminger (2009b). Theorems 11.7–11.8 are mainly of interest because they do not require any moment condition on the observed process and do not rely on high-level assumptions. For additional information on identifiability, in particular on the echelon form, one may for instance refer to Hannan (1976), Hannan and Deistler (1988), Lütkepohl (1991) and Reinsel (1997).
Portmanteau tests on the normalized residuals of multivariate GARCH processes were proposed, in particular, by Tse (2002) and by Duchesne and Lalancette (2003).
Bardet and Wintenberger (2009) established the strong consistency and asymptotic normality of the QMLE for a general class of multidimensional causal processes.
Among models not studied in this book are the spline GARCH models in which the volatility is written as a product of a slowly varying deterministic component and a GARCH-type component. These models were introduced by Engle and Rangel (2008), and their multivariate generalization is due to Hafner and Linton (2010).
11.1 (More or less parsimonious representations)
Compare the number of parameters of the various GARCH(p, q) representations, as a function of the dimension m.
11.2 (Identifiability of a matrix rational fraction)
Let denote square matrices of polynomials. Show that
for all z such that det if and only if there exists an operator U(z) such that
11.3 (Two independent nondegenerate random variables cannot be equal)
Let X and Y be two independent real random variables such that Y = X almost surely. We aim to prove that X and Y are almost surely constant.
1. Suppose that Var(X) exists. Compute Var(X) and show the stated result in this case.
2. Suppose that X is discrete and P(X = x1)P(X = x2) ≠ 0. Show that necessarily x1 = x2 and show the result in this case.
3. Prove the result in the general case.
11.4 (Duplication and elimination)
Consider the duplication matrix Dm and the elimination matrix defined by
![c11ue133_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue133_fmt.jpg)
where A is any symmetric m × m matrix. Show that
![c11ue134_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue134_fmt.jpg)
11.5 (Norm and spectral radius)
Show that
![c11ue135_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue135_fmt.jpg)
11.6 (Elementary results on matrix norms)
Show the equalities and inequalities of (11.50)–(11.51).
11.7 (Scalar GARCH)
The scalar GARCH model has a volatility of the form
![c11ue136_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue136_fmt.jpg)
where the αi and βj are positive numbers. Give the positivity and second-order stationarity conditions.
11.8 (Condition for the Lp and almost sure convergence)
Let p ∈ [1, ∞[ and let (un) be a sequence of real random variables of Lp such that
![c11ue137_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue137_fmt.jpg)
for some positive constant C, and some constant ρ in ]0, 1[. Prove that
![c11ue138_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue138_fmt.jpg)
to some random variable u of Lp.
11.9 (An average of correlation matrices is a correlation matrix)
Let R and Q be two correlation matrices of the same size and let p ∈ [0, 1]. Show that pR + (1 − p)Q is a correlation matrix.
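A quick numerical check of this property (an illustration, not a proof), with two hypothetical 2 × 2 correlation matrices:

```python
# Quick check: a convex combination of correlation matrices keeps a unit
# diagonal and stays positive semi-definite.
import numpy as np

R = np.array([[1.0, 0.6], [0.6, 1.0]])
Q = np.array([[1.0, -0.4], [-0.4, 1.0]])
for p in (0.0, 0.3, 0.7, 1.0):
    M = p * R + (1 - p) * Q
    print(p, M.diagonal(), np.linalg.eigvalsh(M).min() >= -1e-12)
```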
11.10 (Factors as linear combinations of individual returns)
Consider the factor model
![c11ue139_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue139_fmt.jpg)
where the βj are linearly independent. Show that there exist vectors αj such that
![c11ue140_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue140_fmt.jpg)
where the are conditional variances of the portfolios
. Compute the conditional covariance between these factors.
11.11 (BEKK representation of factor models)
Consider the factor model
![c11ue141_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue141_fmt.jpg)
where the βj are linearly independent, ωj > 0, aj ≥ 0 and 0 ≤ bj < 1 for j = 1, …, r. Show that a BEKK representation holds, of the form
![c11ue142_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue142_fmt.jpg)
11.12 (PCA of a covariance matrix)
Let X be a random vector of ℝm with variance matrix Σ. A numerical sketch follows the questions below.
1. Find the (or a) first principal component of X, that is, a random variable C1 = u′1X of maximal variance, where u′1u1 = 1. Is C1 unique?
2. Find the second principal component, that is, a random variable C2 = u′2X of maximal variance, where u′2u2 = 1 and Cov(C1, C2) = 0.
3. Find all the principal components.
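For experimentation, here is a small numerical sketch of this exercise, assuming an arbitrary 3 × 3 covariance matrix Σ: the principal directions are the eigenvectors of Σ ordered by decreasing eigenvalue, the maximal variance equals the largest eigenvalue, and the components are uncorrelated.

```python
# Sketch of Exercise 11.12: principal components from the
# eigendecomposition of an arbitrary covariance matrix Sigma.
import numpy as np

Sigma = np.array([[4.0, 1.2, 0.5],
                  [1.2, 2.0, 0.3],
                  [0.5, 0.3, 1.0]])
eigval, eigvec = np.linalg.eigh(Sigma)          # ascending order
eigval, eigvec = eigval[::-1], eigvec[:, ::-1]  # reorder, largest first

u1 = eigvec[:, 0]                               # first principal direction
print(u1 @ Sigma @ u1, eigval[0])               # max variance = largest eigenvalue
print(np.round(eigvec.T @ Sigma @ eigvec, 10))  # components are uncorrelated
```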
11.13 (BEKK-GARCH models with a diagonal representation)
Show that the matrices A(i) and B(j) defined in (11.21) are diagonal when the matrices Aik and Bjk are diagonal.
11.14 (Determinant of a block companion matrix)
If A and D are square matrices, with D invertible, we have
![c11ue143_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue143_fmt.jpg)
Use this property to show that matrix B in Corollary 11.1 satisfies
![c11ue144_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue144_fmt.jpg)
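Assuming the displayed property is the standard Schur-complement formula det[[A, B], [C, D]] = det(D) det(A − BD−1C), it can be checked numerically as follows:

```python
# Numerical check of the Schur-complement determinant identity
# det([[A, B], [C, D]]) = det(D) * det(A - B D^{-1} C), with D invertible.
import numpy as np

rng = np.random.default_rng(4)
A, B, C, D = (rng.standard_normal((3, 3)) for _ in range(4))
M = np.block([[A, B], [C, D]])
print(np.linalg.det(M),
      np.linalg.det(D) * np.linalg.det(A - B @ np.linalg.inv(D) @ C))
```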
11.15 (Eigenvalues of a product of positive definite matrices)
Let A and B denote symmetric positive definite matrices of the same size. Show that AB is diagonalizable and that its eigenvalues are positive.
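A numerical illustration (AB is similar, via B−1/2, to the symmetric positive definite matrix B1/2AB1/2, which suggests one route to the proof):

```python
# Numerical illustration: AB is diagonalizable with positive eigenvalues
# when A and B are symmetric positive definite (random SPD matrices here).
import numpy as np

rng = np.random.default_rng(2)
M1 = rng.standard_normal((4, 4))
M2 = rng.standard_normal((4, 4))
A = M1 @ M1.T + np.eye(4)      # random symmetric positive definite
B = M2 @ M2.T + np.eye(4)

eig = np.linalg.eigvals(A @ B)
print(np.max(np.abs(eig.imag)), np.min(eig.real))   # ~0 and > 0
```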
11.16 (Positive definiteness of a sum of positive semi-definite matrices)
Consider two matrices of the same size, symmetric and positive semi-definite, of the form
![c11ue145_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue145_fmt.jpg)
where A11 and B11 are also square matrices of the same size. Show that if A22 and B11 are positive definite, then so is A + B.
11.17 (Positive definite matrix and almost surely positive definite matrix)
Let A be a symmetric random matrix such that, for all real vectors c ≠ 0,
![c11ue146_fmt](http://images-20200215.ebookreading.net/1/2/2/9780470683910/9780470683910__garch-models__9780470683910__images__c11ue146_fmt.jpg)
Show that this does not entail that A is almost surely positive definite.
1 The choice is then unique because to any positive definite matrix A, one can associate a unique positive definite matrix R such that A = R² (see Harville, 1997, Theorem 21.9.1). We have R = PΛ½P′, where Λ½ is a diagonal matrix whose diagonal elements are the square roots of the eigenvalues of A, and P is the orthogonal matrix of the corresponding eigenvectors.
2 For two matrices A = (aij) and B = (bij) of the same dimension, the Hadamard product is A ⊙ B = (aij bij).
3 More generally, for i ≥ j, the [(j − 1)m + i]th and [(i − 1)m + j]th rows of Dm equal the m(m + 1)/2-dimensional row vector whose entries are all null, except the [(j − 1)(m − j/2) + i]th, which equals 1.
4 If A = (aij) is an m × n matrix and B is an m′ × n′ matrix, the Kronecker product A ⊗ B is the mm′ × nn′ matrix with block elements aijB.
5 The latter statement can be shown by using the Borel-Cantelli lemma, the Markov inequality and applying Corollary 11.2.