Chapter Ten

Gaussian Random Vectors

10.1 Introduction/Purpose of the Chapter

Gaussian random variables and Gaussian random vectors (vectors whose components are jointly Gaussian, as defined in this chapter) play a central role in modeling real-life processes. Part of the reason for this is that noise-like quantities encountered in many practical applications are reasonably modeled as Gaussian. Another reason is that Gaussian random variables and vectors turn out to be remarkably easy to work with (after an initial period of learning their features). Jointly Gaussian random variables are completely described by their means and covariances, which is part of the simplicity of working with them. Estimating a joint Gaussian distribution therefore amounts to estimating only its means and covariances.

A third reason why Gaussian random variables and vectors are so important is that, in many cases, the performance measures we get for estimation and detection problems in the Gaussian case often bound the performance for other random variables with the same means and covariances. For example, the minimum mean square estimator for Gaussian problems is the same as the linear least squares estimator for other problems with the same mean and covariance and, furthermore, has the same mean square performance. We will also find that this estimator has a simple expression as a linear function of the observations. Finally, we will find that the minimum mean square estimator for non-Gaussian problems performs at least as well as that for Gaussian problems with the same mean and covariance, but that the estimator is typically much more complex. The point of this example is that non-Gaussian problems are often more easily and more deeply understood if we first understand the corresponding Gaussian problem.

In this chapter, we develop the most important properties of Gaussian random variables and vectors, namely the moment-generating function, the moments, the joint densities, and the conditional probability densities.

10.2 Vignette/Historical Notes

The Gaussian distribution is named after Johann Carl Friedrich Gauss (30 April 1777–23 February 1855), one of the most famous mathematicians and physical scientists of the 18th and 19th centuries. Besides probability and statistics, Gauss contributed significantly to many other fields, including number theory, analysis, differential geometry, geodesy, geophysics, electrostatics, astronomy, and optics.

Gauss's work on a theory of the motion of planetoids disturbed by large planets, eventually published in 1809 as Theoria motus corporum coelestium in sectionibus conicis solem ambientium (Theory of motion of the celestial bodies moving in conic sections around the sun), contained an influential treatment of the method of least squares, a procedure used today in all sciences to minimize the impact of measurement error. Gauss justified the method under the assumption of normally distributed errors, a methodology that today is the first step in the analysis of errors produced by very complex processes.

With the rigorous formulation of Brownian motion (the Wiener process) by Norbert Wiener in the 1920s, the Gaussian process became mainstream in the theory of stochastic processes. The chaos theory developed in the 1970s owes a great deal to Gaussian processes, which made it possible to produce useful formulas and bounds. Much of machine learning theory relies on assumptions that errors behave like Gaussian processes. For all these reasons, learning about Gaussian variables and vectors is a crucial skill for any aspiring student of probability.

10.3 Theory and Applications

10.3.1 The Basics

Let us first recall that a random variable follows the normal law N(μ, σ²) with σ > 0 if its density is given by

img

Before introducing the concept of Gaussian vector, let us list some properties of Gaussian random variables, which are already proven in the previous chapters of this book.

  • The parameters μ and σ completely characterize the law of X.
  • X ∼ N(μ, σ²) if and only if

img

where Z ∼ N(0, 1) is a standard normal random variable (an N(0, 1) r.v.). See Remark 5.27.
  • If X ∼ N(μ, σ²) and we denote μp = E(X − EX)p = E(X − μ)p, the pth central moment, then

img

and

img

Note in particular that the third central moment is zero, and the fourth central moment is equal to 3σ⁴. We point this out because these quantities are crucial for describing the shape of a distribution. The third standardized moment of a random variable X is defined as

img

and is a measure of the skew of a distribution (how much it deviates from being symmetric). A value close to 0 indicates very little skew. A negative value of the measure indicates that the distribution has a longer left tail, that is, the random variable is more likely to take values far below its mean than far above it; a positive value indicates the opposite.
The fourth standardized moment of a random variable X is defined as

img

and is a measure of the tails of the distribution compared with the tails of a Gaussian random variable. Once again, recall that the kurtosis for a Gaussian is equal to 3.
A random variable with Kurtosis(X) > 3 is said to have a leptokurtic distribution; in essence, the random variable takes large values more often than an equivalent normal variable. These random variables are crucial in finance, where returns calculated from price processes are observed to have this property. The Cauchy and t distributions are examples of distributions having this property.
A random variable with Kurtosis(X) < 3 is said to have a platykurtic distribution (extreme observations are less frequent than for a corresponding normal). An example is the uniform distribution on a finite interval. A small numerical illustration of these moment measures is given after this list.
  • If X ∼ N(μ, σ²), then

img

for every img, a < b, where

img

is the cumulative distribution function of the standard normal N(0, 1) (its values can usually be found in a table of the normal distribution).
  • The sum of two independent normal random variables is normal; that is, if img, img and X, Y are independent, then

img

The result has been proven in Proposition 7.29. Clearly, the result can be easily extended by induction to finite sums of independent Gaussian random variables. That is, if img for i = 1, …, n, then

img

  • The characteristic function of X ∼ N(μ, σ²) is

φX(t) = E[exp(itX)] = exp(iμt − σ²t²/2)

for every t ∈ ℝ.
  • The moment-generating function of X ∼ N(0, 1) is (see Proposition 9.16)

img

for every img.
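
As a small numerical illustration of the moment properties listed above, here is a short R sketch (not part of the original text): it estimates skewness and kurtosis by simulation, using helper functions defined in the sketch itself; the sample size, the parameters of the normal sample, and the choice of a t distribution with 5 degrees of freedom are arbitrary. The normal sample should give values close to 0 and 3, the t sample a kurtosis above 3 (leptokurtic), and the uniform sample a kurtosis below 3 (platykurtic).

    # Monte Carlo check of skewness and kurtosis (illustrative values only)
    set.seed(1)
    n  <- 1e5
    x  <- rnorm(n, mean = 2, sd = 3)   # N(2, 9) sample
    t5 <- rt(n, df = 5)                # heavier tails than the normal
    u  <- runif(n)                     # uniform on (0, 1)

    skewness <- function(z) mean((z - mean(z))^3) / sd(z)^3
    kurtosis <- function(z) mean((z - mean(z))^4) / sd(z)^4

    c(skewness(x), kurtosis(x))        # approximately 0 and 3
    kurtosis(t5)                       # larger than 3 (leptokurtic)
    kurtosis(u)                        # smaller than 3 (platykurtic), about 1.8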

10.3.2 Equivalent Definitions of a Gaussian Vector

Let us now define the Gaussian vector.


Definition 10.1 A random vector X = (X1, …, Xd) defined on the probability space img is Gaussian if every linear combination of its components is a Gaussian random variable. That is, for every img the r.v.
img
is Gaussian.

The following two facts are immediate.


Proposition 10.2
1. If X = (X1, …, Xd), then Xi is a Gaussian random variable for every i = 1, …, d.
2. If Xi is a Gaussian random variable for every i = 1, …, d and the r.v. Xi are mutually independent, then the vector
img
is a Gaussian vector.

Proof: 1. Indeed, this follows from the definition by choosing αi = 1 and αj = 0 for every j ≠ i, j = 1, …, d.

2. It is a consequence of the definition of the Gaussian vector and of Proposition 7.29. We mention that the assumption that the Xi are independent is crucial. In this chapter, we will see examples showing that it is possible to have random vectors with each component Gaussian which are not Gaussian vectors. img


Remark 10.3 A standard normal Gaussian vector is a random vector
img
such that Xi ∼ N(0, 1) for every i = 1, …, d and the Xi are independent. In this case the mean of X is img and the covariance matrix is Id, the identity matrix (the matrix with 1 on the main diagonal and zeros everywhere else). We denote
img

As we mentioned above, the mean and the variance characterize the law of a (one-dimensional) Gaussian random variable. In the multidimensional case, the mean (which is a vector) and the covariance matrix completely determine the law of a Gaussian vector.

Recall that if X = (X1, …, Xd) is a random vector, then

img

and the covariance matrix of X, denoted by ΛX = (ΛX(i, j))i,j=1,…,d, is defined by

img

for every i, j = 1, …, d.

Let us first remark that the mean and the covariance matrix of a Gaussian vector entirely characterize the first two moments (and thus the distribution) of every linear combination of the components of the vector. We recall the notation of a scalar product of two vectors:

img

if img, and we denoted by xT the transpose of the d × 1 matrix x.


Proposition 10.4 Let X = (X1, …, Xd) be a d-dimensional Gaussian vector with mean vector μ and covariance matrix ΛX. Let img. Define
img
a linear combination of the vector components. Then
img

Proof: It follows from Definition 10.1 that Y is a Gaussian r.v. (a linear combination of Gaussian random variables). It remains to compute its expectation and its variance. First

img

and

img

by noticing that the components of the vector img are

img

img

Using Proposition 10.4, it is possible to obtain the characteristic function of a Gaussian vector.


Theorem 10.5 Let X = (X1, …, Xd) be a Gaussian vector and denote by μ its expectation vector and by ΛX its covariance matrix. Then the characteristic function of X is
(10.2)  φX(u) = E[exp(i⟨u, X⟩)] = exp(i⟨u, μ⟩ − (1/2)⟨u, ΛX u⟩)
for every img. Here we applied Proposition 8.23.

Proof: By definition, we have

img

Using Proposition 10.4, we have that, by Eq. (10.1),

img

where Y = ⟨X, u⟩. It suffices to note that

img

for every img. img


Remark 10.6 Often in the literature, a Gaussian vector is defined through its characteristic function given in Theorem 10.5. That is, a random vector X is called a Gaussian random vector if its characteristic function is given by
img
for some vector μ and some symmetric positive definite matrix Σ.


Theorem 10.7 If X is a d-dimensional Gaussian vector, then there exists a vector img and a d-dimensional square matrix img such that
img
(where =(d) stands for equality in distribution) and N ∼ N(0, Id).

Proof: Let img and define

img

For every img we have

img

Suppose that X is a Gaussian vector. Since ΛX is symmetric and positive definite, there exists img such that

img

Let N1 ∼ N(0, Id). We apply Theorem 10.5 and the beginning of this proof. It follows that X and m + AN1 have the same characteristic function and therefore the same law. img

The converse of Theorem 10.7 is also true.


Theorem 10.8 If
img
with img, img, and N ∼ N(0, Id), then X is a Gaussian vector.

Proof: Let

img

and consider the linear combination

img

where (AN)i is the component i of the vector img, i = 1, …, d. Then, due to Proposition 10.4, we have

img

So every linear combination of the components of X is a Gaussian random variable, which means that X is a Gaussian vector. img

By putting together the results in Theorems 10.5, 10.7, and 10.8, we obtain three alternative characterizations of a Gaussian vector.


Theorem 10.9 Let X = (X1, …, Xd) be a random vector with μ = EX the mean vector and ΛX the covariance matrix. Then the following are equivalent.
1. X is a Gaussian vector.
2. The characteristic function of X is given by
img
for every img.
3. There exist a vector img and a matrix img such that
img
(equality in distribution), where Z ∼ N(0, Id) is a standard Gaussian vector.

Proof: The implication 1 img 2 follows from Theorem 10.5. The implication 2 img 3 is a consequence of the proof of Theorem 10.7. The implication 3 img 1 has been shown in Theorem 10.8. img
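
Characterization 3 of Theorem 10.9 is also a practical recipe for simulating a Gaussian vector: pick any matrix A with A Aᵀ = ΛX and set X = m + AZ with Z ∼ N(0, Id). Below is a minimal R sketch of this idea (not from the text); the mean vector and covariance matrix are arbitrary illustrative choices, and the Cholesky factor produced by chol() plays the role of A.

    # Simulate X = m + A Z with Z ~ N(0, I_d), following Theorem 10.9, part 3.
    set.seed(42)
    m      <- c(1, -2)
    Lambda <- matrix(c(4, 1,
                       1, 2), nrow = 2, byrow = TRUE)  # symmetric, positive definite
    A <- t(chol(Lambda))      # lower-triangular matrix with A %*% t(A) = Lambda

    n <- 1e5
    Z <- matrix(rnorm(2 * n), nrow = 2)   # each column is one draw of Z ~ N(0, I_2)
    X <- m + A %*% Z                      # each column is one draw of X

    rowMeans(X)   # close to m
    cov(t(X))     # close to Lambda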


Remark 10.10 Let us notice that the sum of two independent Gaussian vectors is a Gaussian vector. This follows easily from the definition of a Gaussian vector and the additivity property of the Gaussian random variables.


Example 10.1 A Vector with all Components Gaussian but Which is not a Gaussian Vector
Let us give an example of a vector with each component Gaussian which is not a Gaussian vector. Consider N1N(0, 1) and define the random variable N2 by
img
(a > 0 is some fixed constant) and
img
Let us show that N2 is a Gaussian random variable. We compute its cumulative distribution function. It holds that
img
As a consequence, N2 has the same law as N1, so
img
However, the vector
img
is not a Gaussian vector. Indeed, the sum
img
(which constitutes a linear combination of the components of (N1, N2)) is such that
img
so S takes a fixed value with strictly positive probability; thus it cannot be a random variable with a normal density (or any other continuous density).
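
The explicit formulas for N2 are not reproduced above, but a common version of this construction takes N2 = N1 when |N1| ≤ a and N2 = −N1 when |N1| > a; under that assumption the sum S = N1 + N2 equals 0 exactly on the event {|N1| > a}, which is the atom mentioned in the example. A minimal R sketch under that assumption (the value of a is arbitrary):

    # Example 10.1 by simulation, assuming N2 = N1 on {|N1| <= a} and N2 = -N1 otherwise.
    set.seed(7)
    a  <- 1.5
    n1 <- rnorm(1e5)
    n2 <- ifelse(abs(n1) <= a, n1, -n1)

    c(mean(n2), var(n2))     # N2 looks standard normal: approximately 0 and 1

    s <- n1 + n2             # equals 2*N1 on {|N1| <= a} and exactly 0 otherwise
    mean(s == 0)             # strictly positive, close to 2 * (1 - pnorm(a))
    2 * (1 - pnorm(a))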


Remark 10.11 See also Example 10.5 for a situation when a non-Gaussian random vector has Gaussian marginals.

10.3.3 Uncorrelated Components and Independence

One of the most important properties of a Gaussian vector is the fact that its components are independent if and only if they are uncorrelated. One direction is always true for any random variables: if two random variables are independent, then they are uncorrelated. The converse, on the other hand, is strongly related to the structure of the Gaussian vector, and it does not hold in general for other random variables. In Example 10.5 we show that there exist Gaussian random variables (without the Gaussian vector structure) which are uncorrelated but not independent.


Theorem 10.12 Let X = (X1, …, Xd) be a d-dimensional Gaussian vector. Then for every i, j = 1, …, d, i ≠ j, the random variables Xi and Xj are independent if and only if
img

Proof: If Xi is independent of Xj, then clearly Cov(Xi, Xj) = 0, i ≠ j.

Suppose that Cov(Xi, Xj) = 0 for i ≠ j. Denote by μ = EX and ΛX = (λi,j)i,j=1,…,d the mean vector and covariance matrix of the vector X. Since the off-diagonal covariances are zero, this matrix is a diagonal matrix and we have

img

for every img. We used the fact that for every j = 1, …, d, one has

img

Since the characteristic function of the vector is the product of individual characteristic functions, the components of the vector X are independent. img


Example 10.2
Let X, Y be two independent standard normal random variables and define
img
Then U and V are independent.

Solution: Indeed, it is easy to see that (U, V) is a Gaussian vector (every linear combination of its components is a linear combination of X and Y and (X, Y) is a Gaussian vector). Moreover,

img

and the independence is obtained from Theorem 10.12. img


Example 10.3
Let X1, X2, X3, X4 be four independent standard normal random variables. Define
img
We will show that (Y1, Y2, Y3) is a Gaussian vector with independent components.

Solution: Let us first show that Y is a Gaussian vector. We note that

img

is a Gaussian vector with zero mean and covariance matrix I4. Take img. Then

img

and this is a Gaussian r.v. since X is a Gaussian vector. Moreover,

img

In the same way, we have

img

so Y1, Y2, Y3 are independent random variables. In fact, the vector Y is a Gaussian vector with EY = 0 and covariance matrix

img

img


Example 10.4
Let (X, Y) be a Gaussian couple with zero mean vector such that img and img. Find a scalar α such that X − αY and Y are independent random variables.

Solution: Since (X − αY, Y) is a Gaussian vector (a linear transformation of the Gaussian couple (X, Y)), it suffices to impose the condition

img

this implies

img

where we denoted by

img

the correlation coefficient between X and Y. img


Example 10.5 Two Uncorrelated Normals Which are not Independent
Let X ∼ N(0, 1) and let ε be a random variable such that
img
Suppose that X and ε are independent and define
img
Then both of these random variables (X and Y) are normal and uncorrelated, but they are not independent.

Solution: Let us show first that

img

Indeed, by computing the cumulative distribution function of Y, we get for every img

img

by using the fact that −XN(0, 1). So X, Y have the same law N(0, 1). Let us show that X, Y are uncorrelated. Indeed,

img

since Eε = 0. But it is easy to see that X and Y are not independent.

This example shows that it is possible to find two uncorrelated Gaussian r.v.'s which are not independent. The reason why they are not independent is that the vector (X, Y) is not a Gaussian vector (see exercise 7.8). img
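
The definitions of ε and Y are hidden in the displays above, but the computation Eε = 0 and the use of −X ∼ N(0, 1) suggest the usual construction ε = ±1 with probability 1/2 each (independent of X) and Y = εX. Under that assumption, the following R sketch illustrates the conclusion: the sample correlation of X and Y is near zero, yet |Y| = |X| always, so the two variables are clearly not independent.

    # Example 10.5 by simulation (assuming eps = +/-1 with prob. 1/2, Y = eps * X)
    set.seed(3)
    n   <- 1e5
    x   <- rnorm(n)
    eps <- sample(c(-1, 1), n, replace = TRUE)
    y   <- eps * x

    cor(x, y)               # approximately 0: uncorrelated
    cor(x^2, y^2)           # exactly 1, since y^2 = x^2: not independent
    all(abs(y) == abs(x))   # TRUE: |Y| = |X| in every sample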

A more general result can be stated as follows. The proof follows the arguments in the proof of Theorem 10.12.


Theorem 10.13 Suppose X ∼ N(0, ΛX) with img. Suppose that the components of X can be divided into two groups (Xi)i∈I and (Xj)j∉I, where I ⊂ {1, 2, …, n}, and further suppose that
img
for all i ∈ I and j ∉ I. Then the family (Xi)i∈I is independent of the family (Xj)j∉I.


Remark 10.14 A Gaussian vector X is called degenerate if the covariance matrix ΛX is not invertible (i.e., det ΛX = 0). For example, the vector
img
with X1 a Gaussian r.v., is a degenerate Gaussian vector.

10.3.4 The Density of a Gaussian Vector

Let X = (X1, …, Xd) be a Gaussian vector with independent components. Assume that for every i = 1, …, d we have

img

In this case it is easy to write the density of the vector X. It is, from Corollary 7.23,

(10.3) equation

In the case of a standard normal vector X ∼ N(0, Id), we have

img

When the components of X are not independent, we have the following.


Theorem 10.15 Let X = (X1, …, Xd) be a Gaussian vector with μ = EX and covariance matrix denoted by ΛX. Assume that ΛX is invertible. Then X admits the following probability density function:
for any img.

Proof: From Theorem 10.7 we can write

img

where

img

We apply the change of variable formula (Theorem 7.24) to the function img:

img

We have

img

We obtain

img

where fN denotes the density of the vector N ∼ N(0, Id). Since

img

and

img

we obtain the conclusion of the theorem. img


Remark 10.16 The above result applies only to non-degenerate Gaussian vectors. Indeed, if det ΛX = 0, formula (10.4) does not make sense. In fact, if X is a degenerate Gaussian vector, then X does not have a probability density function.
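
For a non-degenerate Gaussian vector, formula (10.4) can be evaluated directly. The R sketch below (not from the text) implements the density f(x) = (2π)^(−d/2) (det ΛX)^(−1/2) exp(−(x − μ)ᵀ ΛX⁻¹ (x − μ)/2) through a helper function dgauss_vec defined in the sketch and, as a sanity check, compares it with the product of univariate normal densities when ΛX is diagonal (formula (10.3)); the numerical values used are arbitrary.

    # Density of a non-degenerate Gaussian vector, formula (10.4)
    dgauss_vec <- function(x, mu, Lambda) {
      d <- length(mu)
      q <- t(x - mu) %*% solve(Lambda) %*% (x - mu)   # quadratic form (x-mu)' Lambda^{-1} (x-mu)
      as.numeric((2 * pi)^(-d / 2) * det(Lambda)^(-1 / 2) * exp(-q / 2))
    }

    # Sanity check with independent components (diagonal covariance):
    # the joint density must equal the product of the N(mu_i, sigma_i^2) densities.
    mu     <- c(0, 1)
    Lambda <- diag(c(4, 9))
    x      <- c(1, 2)
    dgauss_vec(x, mu, Lambda)
    dnorm(1, mean = 0, sd = 2) * dnorm(2, mean = 1, sd = 3)   # same value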

However, in the case of degenerate Gaussian vectors, we have the property stated in the next proposition. We need to recall the notion of the rank of a matrix. The rank of a matrix is the dimension of the vector space generated by its columns (or, equivalently, by its rows); it is also the largest order of a square minor with nonzero determinant, where a minor of a matrix is obtained by deleting any number of rows and columns. Obviously, if a d × d matrix is invertible, then its rank is d.


Proposition 10.17 Let X ∼ N(μ, ΛX) in img and assume that
img
that is, X is a degenerate random vector.
Then there exists a vector space img of dimension d − k such that ⟨a, X⟩ = aTX is a constant random variable for every a ∈ H.


Example 10.6
Let X = (X1, X1) with X1 a one-dimensional Gaussian random variable. Then clearly
img
Let
img
Then H is a subspace of img and ⟨C, X⟩ = CTX = 0 for every C ∈ H.


Example 10.7
Let X be a two-dimensional Gaussian vector with zero mean and covariance matrix
img
Let us write the density of the vector X. First detΛX = 15 and
img
Therefore
img


Remark 10.18
1. As we mentioned, μ and ΛX completely characterize the law of a Gaussian vector. This is not true in general for other types of random vectors.
2. It is easy to see that in the case when the components of X are independent, formula (10.4) reduces to (10.3).

In the case of a Gaussian vector of dimension 2 (a Gaussian couple), we have the following:


Proposition 10.19 Let X = (X1, X2) be a Gaussian couple with
img
and
img
where we denoted the correlation,
img

Assume ρ2 ≠ 1. Then the density of the vector X is

img

Proof: This follows from Theorem 10.15 since

img

img

10.3.5 Cochran's Theorem

Recall that if X ∼ N(0, 1), then img, the gamma distribution with parameters img and img. This law is called the chi square distribution and is usually denoted by χ2(1), with 1 denoting one degree of freedom. More generally, if X1, …, Xd are independent standard normal random variables, then

img

follows the law img and this is called the chi square distribution with d degrees of freedom, denoted by χ2(d).

This situation can be extended to nonstandard normal random variables.


Definition 10.20 If X = (X1, …, Xd) is a Gaussian vector with EX = μ and ΛX = Id, then the law of ||X||2 is denoted by
img
and it is called the noncentral chi square distribution with d degrees of freedom and noncentrality parameter μ. When ||μ|| = 0, then obviously χ2(d, 0) = χ2(d) .

Recall that the modulus (or the Euclidean norm) of a d-dimensional vector is

img


Remark 10.21 The law of the random variable ||X||2 depends only on ||μ|| and d. Indeed, if img is such that
img
with ||μ′|| = ||μ||, then there exists an orthonormal matrix U such that
img
Therefore,
img
and
img


Remark 10.22 We know that the density of the χ2(d) law is
img
In the case of the noncentral chi square distribution χ2(d, a) we can prove that the density is
where Yq denotes a random variable with distribution χ2(q).

We can now state the Cochran theorem.


Theorem 10.23 (Cochran's theorem) Assume
img
where Ei, i = 1, …, r, are orthogonal subspaces of img with dimensions d1, …, dr, respectively. Denote by
img
the orthogonal projection of X on the subspace Ei.
Then img are independent random vectors, img are also independent, and
img
where img is the projection of μ on Ei for every i = 1, …, r.

Proof: Let

img

be an orthonormal basis of Ej. Then

img

The random vectors img are independent, with distribution img, so the random vectors img are independent. To finish, it suffices to remark that

img

for every j = 1, …, r. img

Let us give an important application of Cochran's theorem to Gaussian random vectors.


Proposition 10.24 Let X = (X1, …, Xn) denote a Gaussian vector with independent identically distributed N(μ, σ2) components. Let us define
img
and
img
the sample mean and sample variance respectively.
Then
1. img and img are independent.
2. img.
3. img.

Proof: We set for every i = 1, …, n

img

Then Yi are independent identically distributed N(0, 1). We also set

img

(by Vect(e) we mean the vector space of img generated by the vector e). Then

img

The projections of Y = (Y1, …, Yn) on E and E⊥ are independent and given by

img

and

img

We therefore have

img

and

img

which gives the conclusion. img
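
A quick R sketch (not from the text) can make Proposition 10.24 tangible: repeatedly draw samples of size n from N(μ, σ²), compute the sample mean and the sample variance, and check that they are essentially uncorrelated, that the sample mean has variance σ²/n, and that (n − 1)S²/σ² behaves like a χ²(n − 1) variable. This assumes S² is taken with the 1/(n − 1) normalization, which is the convention used by R's var(); the parameter values are arbitrary.

    # Proposition 10.24 by simulation (S^2 with the 1/(n-1) convention, as in R's var())
    set.seed(11)
    mu <- 2; sigma <- 3; n <- 10; reps <- 2e4

    samples <- matrix(rnorm(n * reps, mean = mu, sd = sigma), nrow = n)
    xbar <- colMeans(samples)            # one sample mean per column
    s2   <- apply(samples, 2, var)       # one sample variance per column

    cor(xbar, s2)                        # approximately 0 (they are in fact independent)
    c(mean(xbar), var(xbar))             # approximately mu and sigma^2 / n
    mean((n - 1) * s2 / sigma^2)         # approximately n - 1, the mean of chi^2(n-1)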

10.3.6 Matrix Diagonalization and Gaussian Vectors

We start with some notions concerning eigenvalues and eigenvectors of a matrix.


Definition 10.25 Let A be a matrix in img. We say that λ is an eigenvalue of the matrix A if there exists a vector img, u ≠ 0 such that
img
In this case we will say that u is an eigenvector of the matrix A.


Remark 10.26 For every vector img we have Inu = u (where we denote by In the identity matrix in img). This implies that every nonzero vector img is an eigenvector of the identity matrix In associated with the eigenvalue λ = 1.
If D = Diag(D1, …, Dn) is a diagonal matrix, then every Di, i = 1, …, n, is an eigenvalue of the matrix D and every vector ei = (0, 0, …, 1, …, 0) of the canonical basis of img is an eigenvector of D associated to the eigenvalue Di.


Proposition 10.27 If λ is an eigenvalue of the matrix img, then the set Eλ of the eigenvectors of A associated with λ, together with the zero vector, is a vector subspace of img.

Proof: Let λ be an eigenvalue of A. If u img Eλ is a corresponding eigenvector, then for every img, α ≠ 0 we have

img

so

(10.6) equation

which shows that Eλ is closed under multiplication with scalars.

If img is another vector in Eλ, then

img

so

(10.7) equation

Relations (10.6) and (10.7) show that Eλ is a vector space in img. img


Definition 10.28 The vector space Eλ is called the eigenspace associated with the eigenvalue λ.


Definition 10.29 If img is an n-dimensional matrix, we define
img
the kernel or the null space of the matrix A.


Proposition 10.30 Let img. Then the eigenvalues of A are solutions of the equation
img
and the eigenvectors associated with λ are the elements of Ker(A − λIn).

Proof: If λ is an eigenvalue for A, there exists u ≠ 0, img such that Au = λu; therefore

img

This implies that the matrix A − λIn is not invertible and thus det(A − λIn) = 0.

Conversely, if det(A − λIn) = 0, then A − λIn is not invertible, so there exists img, not identically zero, such that (A − λIn)u = 0.

Finally, if λ is an eigenvalue of A, then the set of associated eigenvectors consists of the vectors img satisfying (A − λIn)u = 0, which is exactly the definition of Ker(A − λIn). img


Example 10.8
Let
img
then
img
and det(A − λI2) = λ² − 3λ − 4. The solutions of the equation det(A − λI2) = 0 (which are the eigenvalues of A) are
img
Further,
img
and
img


Definition 10.31 A matrix img is diagonalizable if there exists an invertible matrix img such that
img
where D is a diagonal matrix.

When A is diagonalizable, every column of the matrix P is an eigenvector of A, and the diagonal matrix D contains the eigenvalues of A on its diagonal: column i of P is an eigenvector associated with the eigenvalue in position i on the diagonal of D.
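
In R, the function eigen() returns the eigenvalues and a matrix whose columns are the corresponding eigenvectors, so the relation A = P D P⁻¹ can be checked directly. The matrix below is an arbitrary symmetric example for illustration (it is not the matrix of Example 10.8, whose entries are not reproduced here).

    # Eigenvalues, eigenvectors and diagonalization with eigen() (arbitrary example matrix)
    A  <- matrix(c(2, 1,
                   1, 2), nrow = 2, byrow = TRUE)
    ev <- eigen(A)
    D  <- diag(ev$values)     # eigenvalues on the diagonal
    P  <- ev$vectors          # column i is an eigenvector for ev$values[i]

    P %*% D %*% solve(P)                    # recovers A, i.e. A = P D P^{-1}
    A %*% P[, 1] - ev$values[1] * P[, 1]    # essentially the zero vector: A u = lambda u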

The following results apply to any random vector, not only to Gaussian random vectors. We now review the requirements that a covariance matrix must satisfy.


Definition 10.32 A matrix A is called symmetric if and only if it is equal to its transpose, A = AT, or element-wise aij = aji for all i, j. Note that, by definition, a symmetric matrix must be a square matrix (the number of columns equals the number of rows).
A d × d-dimensional matrix A is called positive definite if and only if
img
for any img, u ≠ 0.
The matrix A is called non-negative definite if and only if
img

Note that uTAu is a number (a scalar), so its sign is well-defined. Further, we always regard vectors in img as matrices of dimension d × 1.


Proposition 10.33 Let img be a random vector with mean μ and covariance matrix ΛX. Then the matrix ΛX is symmetric and non-negative definite.

Proof: To prove this proposition, let us first remark that the covariance matrix is a square matrix. Next, the element on row i column j is

img

thus the matrix must be symmetric.

Regarding non-negative definiteness: for any img we can construct the one-dimensional random variable uTX. Since this is a valid random variable, its variance must be non-negative. So let us calculate this variance:

img

Since the quantity being squared is a scalar, and a scalar is equal to its own transpose, we may write

img

Thus, the condition that variance is non-negative translates into the condition that the covariance matrix is non-negative definite. img


Remark 10.34 The distinction between non-negative definite and positive definite matrices is important for random vectors. If some nonzero linear combination of the components is identically zero (or constant), then its variance is zero and the covariance matrix is only non-negative definite. If the components are independent with strictly positive variances, then the covariance matrix is diagonal with positive entries, hence positive definite.
As a simple example, consider the random vector X = (X1, −X1) for some random variable X1. The covariance matrix of this vector is non-negative definite but not positive definite, since there exists a nonzero vector img with uTX = 0, whose variance is therefore zero.
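
The following R sketch (illustrative only) checks Proposition 10.33 and Remark 10.34 numerically on the degenerate vector X = (X1, −X1): the empirical covariance matrix is symmetric, its eigenvalues are non-negative, and one of them is essentially zero because the linear combination given by a vector such as u = (1, 1) is identically zero.

    # Covariance matrices are symmetric and non-negative definite (Proposition 10.33);
    # the degenerate vector X = (X1, -X1) of Remark 10.34 has a zero eigenvalue.
    set.seed(5)
    x1 <- rnorm(1e4)
    X  <- cbind(X1 = x1, X2 = -x1)   # each row is one observation of the vector
    S  <- cov(X)

    all.equal(S, t(S))               # TRUE: symmetric
    eigen(S)$values                  # non-negative; one eigenvalue is essentially 0
    var(X %*% c(1, 1))               # u = (1, 1): u'X = X1 - X1 = 0, so its variance is 0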

Checking that a square matrix is positive definite can be complicated. However, there is an easy way to check involving the eigenvalues of the matrix.


Lemma 10.35 A symmetric matrix img is positive definite if and only if all its eigenvalues are positive (the eigenvalues of a real symmetric matrix are automatically real). It is non-negative definite if and only if all its eigenvalues are non-negative.

The following result is important because it can be applied to the covariance matrices.


Theorem 10.36 Let A be a symmetric, positive definite n × n matrix. Then A is diagonalizable by an orthonormal matrix P. That is, there exists an orthonormal matrix P (i.e., PT = P−1) such that
img
with img a diagonal matrix.

The above result says that every symmetric positive definite matrix is diagonalizable in an orthonormal basis. That is, it can be transformed into a diagonal matrix by a change of basis given by an orthonormal matrix.


Definition 10.37 Two vectors x = (x1, …, xn) and y = (y1, …, yn) in img are called orthogonal if
img
A vector is called with norm 1 if img.
The vectors x and y are called orthonormal if they are orthogonal and ||x|| = ||y|| = 1.


Theorem 10.38 Let X be a d-dimensional Gaussian vector with zero mean and covariance matrix ΛX. Then there exists a matrix img such that BX is a Gaussian vector with independent components.

Proof: The proof follows by using Theorem 10.36. Since ΛX is a symmetric matrix, it can be diagonalized in an orthonormal basis. That is, there exists an orthogonal matrix B such that BΛXBT is diagonal. The covariance matrix of BX is BΛXBT, so the components of BX are uncorrelated and therefore, since BX is a Gaussian vector, independent Gaussian random variables. img


Example 10.9
Let X be a centered Gaussian vector with
img
Note first that this matrix is symmetric. To be a valid (non-degenerate) covariance matrix, it needs to be positive definite, which can be checked by looking at the signs of the eigenvalues (all eigenvalues should be positive); it can then be diagonalized. The two eigenvalues are the solutions of the equation
img
and thus
img
The matrix is therefore symmetric and positive definite. To find the eigenvectors, we solve ΛXu = λiu (the definition) for both i = 1 and i = 2. This gives
img
and
img
The eigenspace img is generated by the vector (1, −1), while img is generated by the vector (1, 1). These two vectors are orthogonal but not orthonormal (their Euclidean norm is not 1). To make them orthonormal, we normalize them. We define
img
and
img
and therefore the matrix P given by
img
Then it can be checked that
img
where D is the diagonal matrix with the eigenvalues of ΛX on the diagonal
img


Remark 10.39 If the matrix ΛX is small, the decomposition shown above can be carried out by hand. If the dimension of the matrix is large, the methodology presented here still works, but it becomes very tedious. In that case, one uses a computer and reaches the same decomposition using the methodology presented in Section 6.4.
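
As the remark suggests, for larger matrices one lets the computer do the work. The R sketch below carries out the decorrelation of Theorem 10.38 for an arbitrary symmetric positive definite covariance matrix (not the one of Example 10.9): eigen(..., symmetric = TRUE) returns orthonormal eigenvectors P, and B = Pᵀ makes B ΛX Bᵀ diagonal, so the components of BX are uncorrelated and hence independent.

    # Decorrelating a Gaussian vector (Theorem 10.38) via an eigendecomposition.
    # Lambda is an arbitrary symmetric positive definite matrix, for illustration only.
    set.seed(9)
    Lambda <- matrix(c(5, 2,
                       2, 3), nrow = 2, byrow = TRUE)
    P <- eigen(Lambda, symmetric = TRUE)$vectors   # orthonormal columns of eigenvectors
    B <- t(P)

    round(B %*% Lambda %*% t(B), 10)   # diagonal matrix with the eigenvalues of Lambda

    # Check on simulated zero-mean data with covariance Lambda:
    A <- t(chol(Lambda))
    X <- A %*% matrix(rnorm(2 * 1e5), nrow = 2)
    cov(t(B %*% X))                    # approximately diagonal: components of BX uncorrelated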

Exercises

Problems with Solution

10.1 Suppose

img

and define

img

a. Find img such that Y = (Y1, Y2) is a Gaussian vector with independent components.
b. Write the density of the vector Y.

Solution: Y is clearly a Gaussian vector since Y = AX with

img

where each component of Y is a linear combination of the components of the original Gaussian vector (which has independent components). To make the components of Y independent, we need to impose the condition

img

But

img

Therefore if a + b = 0, the components Y1, Y2 are independent. Since

img

we will have in this case (a + b = 0)

img

img

10.2 Let X = (X1, X2, X3) be a random vector in img with density

img

a. Find the law of X. Determine the value of k which makes this a proper density function.
b. Let

img

Find the parameters a, b, c, d, e such that the covariance matrix of img is I3.
c. What can be said about the variables X1, img, and img ?

Solution: Note that

img

where

img

To see this decomposition, think about how the polynomial terms arise. The diagonal elements give the squared terms in a unique way, so they are easy to recognize. For the off-diagonal elements, note that twice the element gives the coefficient of the corresponding cross term (because the matrix is symmetric).

Once we write it in this form, we recognize the density of a Gaussian vector X with zero mean and covariance matrix

img

Consequently,

img

To solve part (b), we need to impose that

img

and

img

To impose these conditions, we need to calculate the covariance matrix:

img

Thus we need to know how to invert a matrix or to use a software program to do so. Using R gives

img

The command “solve(M)” is the R command to find the inverse of the matrix M. We can now read the covariances between the original vector components X1, X2, X3.

Now, using the matrix above and the formulas for the vector, the conditions are

img

V(X1) is already 1. Using a = b and c = d from the first two equations, the remaining equations become

img

The first two equations are incompatible with a = 0, so we must have c = −e and either a = 1 or a = −1. Using this in the last equation gives e = 1. Thus the problem has more than one solution. Either of these vectors

img

will have the desired properties.

Finally, for part (c): either of these vectors is Gaussian, and by requiring that its covariance matrix be the identity we have found components which are mutually independent. img

10.3 Let X, Y, Z be independent standard normal random variables. Denote

img

and

(10.8) equation

Show that U and V are independent.

Solution: Define

img

Clearly, A is a Gaussian vector with img and covariance matrix

img

the identity matrix. It follows that the vector

img

is also a Gaussian vector (every linear combination of its components is a linear combination of X, Y, Z). We will show that the first component is independent of the other three. To prove this, since the vector is Gaussian, it suffices to show that the first component is uncorrelated with the other three. We have

img

and in a similar way

img

Therefore the r.v. X + Y + Z is independent of X − Y, X − Z, and Y − Z, respectively. By the associativity property of independence, X + Y + Z is independent of the vector (X − Y, X − Z, Y − Z) and is thus independent of V. img

10.4 Let X1, …, Xn be independent N(0, 1) distributed random variables. Let img. Give a necessary and sufficient condition on the vector a in order to have X − ⟨a, X⟩a and ⟨a, X⟩ independent.

Solution: The vector

img

is an (n + 1)-dimensional Gaussian vector and for every i = 1, …, n we obtain

img

Therefore if we impose the condition

img

then all covariances between ⟨a, X⟩ and the other components of the vector will be zero. This accomplishes what is needed in the problem. img

10.5 Let X = (X1, X2, …, Xn) denote an n-dimensional random vector with independent components such that img for every i = 1, …, n. Define

img

a. Give the law of img
b. Let a1, …, an be in img. Give a necessary and sufficient condition (in terms of a1, …, an) such that img and a1X1 + ⋯ + anXn are independent.
c. Deduce that the vector img is independent of img.

Solution: As a sum of independent normal random variables, img is a normal random variable. Its parameters can be easily calculated as mean img and variance img. So

img

Note that the vector

img

is a Gaussian random vector. Indeed, every linear combination of its components is a linear combination of the components of X, so it is a Gaussian random variable. Therefore, img and img are independent if and only if they are uncorrelated; and after calculating the covariance, this is equivalent to

img

Hint for part (c): Wn is invariant under translation: Wn(X) = Wn(X + a), where X + a = (X1 + a, …, Xn + a) for img. Consider also Proposition 10.24. img

10.6 Let (X, Y) be a Gaussian vector with mean 0 and covariance matrix

img

with ρ ∈ [−1, 1]. What can be said about the random variables

img

Solution: Clearly, (X, Z) is a Gaussian vector as a linear transformation of a Gaussian vector. Since

img

we note that the r.v.'s X and Z are independent. img

10.7 Suppose X ∼ N(0, 1). Prove that for every x > 0 the following inequalities hold:

img

Solution: Consider the following functions defined on (0, ∞):

img

and

img

We need to show that

img

for every x > 0, where F is the c.d.f. of an N(0, 1) distributed random variable.

Since the normal c.d.f. does not have a closed form, we look at the derivatives of these functions. We need to check that for x > 0:

img

where img is the standard normal density. Therefore, integrating the respective positive functions on (0, ∞), we obtain

img

and

img

img
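
The exact inequalities of Exercise 10.7 are not reproduced above; assuming they are the classical Gaussian tail bounds φ(x)(1/x − 1/x³) ≤ 1 − F(x) ≤ φ(x)/x for x > 0 (where φ is the standard normal density), the following short R check confirms them numerically at a few arbitrary points.

    # Numerical check of the classical Gaussian tail bounds (assumed form of Exercise 10.7):
    #   dnorm(x) * (1/x - 1/x^3)  <=  1 - pnorm(x)  <=  dnorm(x) / x,   for x > 0
    x     <- c(0.5, 1, 2, 3, 5)
    tail  <- 1 - pnorm(x)
    lower <- dnorm(x) * (1 / x - 1 / x^3)
    upper <- dnorm(x) / x

    cbind(x, lower, tail, upper)          # lower <= tail <= upper in every row
    all(lower <= tail & tail <= upper)    # TRUE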

Problems without Solution

10.8 Suppose

img

Show that XY has the same law as img

Hint: Use the polarization formula

img

10.9 Prove the expression of the density function of the noncentral chi square distribution (10.5).

10.10 Let (X, Y) be a two-dimensional Gaussian vector with zero expectation and I2 covariance matrix. Compute

img

10.11 Let X, Y be two independent N(0, 1) distributed random variables. Define

img

a. Prove that U and V are independent.
b. Show that img.
c. Show that the function

img

defines a probability density.
d. Show that V admits as density the function f above (this distribution is called the arcsin distribution).

10.12 Let X1, …, Xn be i.i.d. N(0, 1) random variables. Define

img

Compare and calculate EU and EV.

10.13 Let img with

img

a. Let (Y1, Y2) be a standard Gaussian vector (i.e., Y1, Y2 are independent standard Gaussian random variables). Find a function

img

such that img.
b. Let img and img be two independent random variables with the respective distributions. Let R ≥ 0 be the square root of R2.
Prove that the random variables X = R cos (Θ) and Y = R sin (Θ) are standard Gaussian and independent.
c. Deduce that if U1, U2 are independent img, then the r.v.s

img

and

img

are standard Gaussian and independent.
d. Consider U1, U2 independent with law img. Construct from U1, U2 a vector (X1, X2) with law img.

10.14 Suppose X ∼ N(0, Λ) where

img

Let

img

a. Give the law of Yi, i = 1, 2, 3.
b. Write down the density of the vector

img

10.15 Let X = (X1, X2, X3) be a Gaussian vector with law N3(m, C) with density

img

where

img

a. Calculate m and C−1.
b. Calculate C and the marginal distributions of X1, X2, X3.
c. Give the law of

img

10.16 Let (X, Y) be a normal random vector such that X and Y are standard normal N(0, 1). Suppose that Cov(X, Y) = ρ. Let img and put

img

a. Show that |ρ| ≤ 1.
b. Calculate E(U), E(V), Var(U), Var(V), and Cov(U, V). What can we say about the vector (U, V) ?
Suppose ρ ≠ 0. Do there exist values of θ such that U and V are independent?
c. Assume ρ = 0. Give the laws of U and V?
d. Are the r.v. U and V independent?

10.17 Let (X, Y, Z) be a Gaussian vector with mean (1, 2, 3) and covariance matrix

img

Set

img

a. Give the law of the couple img. What can be said about U and V? Write without calculation the density of (U, V).
b. Find constants c and d such that the r.v. W = Z + cU + dV is independent of (U, V).
c. Write the covariance matrix of the vector (U, V, W).

10.18 If X is a standard normal random variable N(0, 1), let ϕ denote its characteristic function and F its c.d.f. For every integer p ≥ 1, denote the pth moment by

img

Let (Yn , n ≥ 1) be a sequence of independent r.v. with identical distribution N(0, 1). For every k ≥ 1 and n ≥ 1 integers, let

img

a. Give the distribution of Xk.
b. Calculate img. Are the variables (Xk, k ≥ 1) independent?
c. Show that img. Deduce that for every integer n ≥ 1 the r.v. Sn follows the law

img

Hint: We recall the formula

img

d. For every integer p ≥ 1, calculate E(|Sn|p) in terms of img.
e. Let img. Show that the sequence (nα Sn , n ≥ 1) converges to 0 in L2, that is,

img

as a sequence of numbers.
f. Show that for every β > 0 and for every p ≥ 1 integer, there exists a constant Kp > 0 such that

img

g. Show that the sequence (nα Sn , n ≥ 1) converges almost surely to an r.v. and identify this limit.
h. Calculate the characteristic function img of Sn using ϕ.
i. Let

img

Deduce the expression of the characteristic function of Tn.
j. Show that the sequence (Tn, n ≥ 1) converges in law to a limit and identify this limit.

10.19 Let X1, X2, and X3 be three i.i.d. random variables whose common distribution has zero mean and variance σ² > 0. Denote

img

a. Calculate the covariance matrix of the vector (X1, Y1, Y2).
b. Give an upper bound for

img

using Bienaymé–Tchebychev inequality in Appendix B.
c. Suppose that X1, X2, and X3 are Gaussian. In this case, give another upper bound for a and compare with the previous question.
d. Give an upper bound for

img

Hint: Use img and img.

10.20 If X ∼ N(0, 1) and Y ∼ χ2(n), and X, Y are independent, show that

img

has a Student t distribution (tn) with n degrees of freedom. Specifically, show that the probability density function of Z is given by

img

10.21 Assume X and Y are two independent N(0, 1) random variables. Find the law of

img

Hint: We know (and can prove) that X + Y and X − Y are independent and that each has the N(0, 2) distribution. Then, once we show that

img

is χ2(1) distributed, we will obtain

img

which follows a Student distribution with one degree of freedom (see Exercise 10.20).

10.22 Consider the matrix

img

a. Find the eigenvalues of A.
b. Find the eigenspaces associated with each eigenvalue.
c. Diagonalize the matrix A.
d. Let X be a Gaussian random vector with covariance matrix A and zero mean. Find a linear transformation that transforms X into a Gaussian vector with independent components.

10.23 Consider the matrix

img

a. Check that λ1 = 8 is an eigenvalue of A.
b. Find the other eigenvalues of A.
c. Find the eigenspaces associated with each eigenvalue.
d. Diagonalize the matrix A.
e. Can there exist a Gaussian random vector with covariance matrix A?

10.24 Let the matrix Σ be defined as

img

Check that the matrix is symmetric and positive definite. Find its eigenvalues and the associated eigenvectors. Now let X be a Gaussian random vector with covariance matrix Σ and zero mean. Find a linear transformation that transforms X into a Gaussian vector with independent components.
