Chapter 8
Some important differentials

1 INTRODUCTION

Now that we know what differentials are and have adopted a convenient and simple notation for them, our next step is to determine the differentials of some important functions. We shall discuss the differentials of some scalar functions of X (eigenvalue, determinant), a vector function of X (eigenvector), and some matrix functions of X (inverse, Moore‐Penrose (MP) inverse, adjoint matrix).

But first, we must list the basic rules.

2 FUNDAMENTAL RULES OF DIFFERENTIAL CALCULUS

The following rules are easily verified. If u and v are real‐valued differentiable functions and α is a real constant, then we have

(1) dα = 0,

(2) d(αu) = α du,

(3) d(u + v) = du + dv,

and also

(4) d(uv) = (du)v + u dv,

(5) d(u/v) = (v du − u dv)/v² (v ≠ 0).

The differential of the power function is

(6) du^α = αu^(α−1) du,

where the domain of definition of the power function u^α depends on the arithmetical nature of α. If α is a positive integer, then u^α is defined for all real u; but if α is a negative integer or zero, the point u = 0 must be excluded. If α is a rational fraction, e.g. α = p/q (where p and q are integers and we can always assume that q > 0), then u^α = (u^p)^(1/q), the qth root of u^p, so that the function is determined for all values of u when q is odd, and only for u ≥ 0 when q is even. In cases where α is irrational, the function is defined for u > 0.

The differentials of the logarithmic and exponential functions are

(7) d log u = u⁻¹ du (u > 0),

(8) de^u = e^u du,

(9) dα^u = α^u log α du (α > 0).

Similar results hold when U and V are matrix functions, and A is a matrix of real constants:

(10) dA = 0,

(11) d(αU) = α dU,

(12) d(U + V) = dU + dV,

(13) d(UV) = (dU)V + U dV.

For the Kronecker product and Hadamard product, the analog of (13) holds:

(14) d(U ⊗ V) = (dU) ⊗ V + U ⊗ dV,

(15) d(U ⊙ V) = (dU) ⊙ V + U ⊙ dV.

Finally, we have

(16) dU′ = (dU)′,

(17) d vec U = vec dU,

(18) d tr U = tr dU.

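For example, to prove (3), let ϕ(x) = u(x) + v(x). Then,

dϕ(x; h) = Σ_j h_j D_jϕ(x) = Σ_j h_j (D_ju(x) + D_jv(x)) = Σ_j h_j D_ju(x) + Σ_j h_j D_jv(x) = du(x; h) + dv(x; h).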

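As a second example, let us prove (13). Using only (3) and (4), we have

(d(UV))_ij = d(UV)_ij = d Σ_k u_ik v_kj = Σ_k d(u_ik v_kj) = Σ_k [(du_ik)v_kj + u_ik dv_kj] = Σ_k (du_ik)v_kj + Σ_k u_ik dv_kj = ((dU)V)_ij + (U dV)_ij,

and hence d(UV) = (dU)V + U dV.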

Exercises

1. Prove Equation (14).
2. Show that d(UVW) = (dU)VW + U(dV)W + UV(dW).
3. Show that d(AXB) = A(dX)B, where A and B are constant.
4. Show that d tr X′X = 2 tr X′dX.
5. Let u : S → ℝ be a real‐valued function defined on an open subset S of ℝⁿ. If u′u = 1 on S, then u′du = 0 on S.
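
The matrix product rule d(UV) = (dU)V + U dV and the identity d tr X′X = 2 tr X′dX of Exercise 4 can be checked numerically. The following is only a minimal finite-difference sketch in plain Python; the helper names, the test matrices, and the step size t are illustrative assumptions, not part of the text.

```python
# Finite-difference check of the product rule and of Exercise 4.
# All names and numbers below are illustrative choices.

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def combine(A, B, scale=1.0):
    """Return A + scale * B, element by element."""
    return [[a + scale * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def transpose(A):
    return [list(column) for column in zip(*A)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def max_abs(A):
    return max(abs(a) for row in A for a in row)

U = [[1.0, 2.0], [0.5, -1.0]]
V = [[0.0, 1.0], [3.0, 2.0]]
dU = [[0.3, -0.2], [0.1, 0.4]]    # arbitrary perturbation directions
dV = [[-0.5, 0.2], [0.7, 0.1]]
t = 1e-6                           # small step size

# (U + t dU)(V + t dV) - UV should equal t[(dU)V + U(dV)] up to O(t^2).
increment = combine(matmul(combine(U, dU, t), combine(V, dV, t)), matmul(U, V), -1.0)
differential = combine(matmul(dU, V), matmul(U, dV))
product_rule_error = max_abs(combine(increment, differential, -t)) / t

# tr (X + t dX)'(X + t dX) - tr X'X should equal 2t tr X'dX up to O(t^2).
X, dX = U, dU
Xt = combine(X, dX, t)
trace_increment = trace(matmul(transpose(Xt), Xt)) - trace(matmul(transpose(X), X))
trace_rule_error = abs(trace_increment / t - 2.0 * trace(matmul(transpose(X), dX)))

print(product_rule_error, trace_rule_error)   # both should be of order t
```

Both errors shrink proportionally to t, which is what one expects when the first-order term has been identified correctly.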

3 THE DIFFERENTIAL OF A DETERMINANT

Let us now apply these rules to obtain a number of useful results. The first of these is the differential of the determinant.

It is worth stressing that at points where r(F(X)) = m − 1, F(X) must have at least one zero eigenvalue. At points where F(X) has a simple zero eigenvalue (and where, consequently, r(F(X)) = m − 1), (20) simplifies to

where μ(F(X)) is the product of the m − 1 nonzero eigenvalues of F(X).

In the previous proof, we needed element‐by‐element differentiation, something that we would rather avoid. Hence, let us provide a second proof, which is closer to the spirit of matrix differentiation.

Second proof of Theorem 8.1

Denote the partial derivative of |Y| with respect to its ijth element y_ij by a_ij(Y), and define the matrix A(Y) whose ijth element is a_ij(Y). Then, by definition,

(22)

Let X be a given nonsingular matrix and define Z = X⁻¹Y. Writing

we obtain

(23)

It follows from (22) and (23) that

and hence in particular that

To complete the proof, we show that A(I) = I. This follows from the fact that

Since X⁻¹ = |X|⁻¹ X^#, where X^# denotes the adjoint matrix, we can also write the result as

(24) d|X| = tr X^# dX.

The result in Equation (24) holds also for singular X, because the set of invertible matrices forms a dense subset of the set of all n × n matrices, and both sides of (24) are polynomial functions, hence continuous. (A subset E of S is dense in S if for any point x ∈ S, any neighborhood of x contains at least one point from E, i.e. E has nonempty intersection with every nonempty open subset of S.)
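
That the adjoint form of the determinant differential remains valid at singular points can be illustrated numerically. The sketch below assumes that the result reads d|X| = tr X^# dX (the form quoted in Miscellaneous Exercise 6) and restricts itself to 2 × 2 matrices; the function names and test matrices are illustrative.

```python
# Check d|X| = tr(X^# dX) for a nonsingular and a singular 2 x 2 matrix.

def det2(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

def adjoint2(X):
    """Adjoint (adjugate) matrix of a 2 x 2 matrix."""
    return [[X[1][1], -X[0][1]], [-X[1][0], X[0][0]]]

def trace_of_product(A, B):
    """tr(AB), computed without forming AB."""
    n = len(A)
    return sum(A[i][k] * B[k][i] for i in range(n) for k in range(n))

def determinant_gap(X, dX, t=1e-6):
    """Finite-difference quotient of the determinant minus tr(X^# dX)."""
    Xt = [[x + t * h for x, h in zip(rx, rh)] for rx, rh in zip(X, dX)]
    return (det2(Xt) - det2(X)) / t - trace_of_product(adjoint2(X), dX)

dX = [[0.4, -1.0], [0.3, 0.2]]                                   # arbitrary direction
gap_nonsingular = determinant_gap([[2.0, 1.0], [1.0, 3.0]], dX)
gap_singular = determinant_gap([[1.0, 2.0], [2.0, 4.0]], dX)     # rank one

print(gap_nonsingular, gap_singular)   # both of order t
```

In the 2 × 2 case the remainder is exactly t|dX|, so the gap is of order t whether or not X is singular.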

We do not, at this point, derive the second‐ and higher‐order differentials of the determinant function. In Section 8.4 (Exercises 1 and 2), we obtain the differentials of log |F| assuming that F(X) is nonsingular. To obtain the general result, we need the differential of the adjoint matrix. A formula for the first differential of the adjoint matrix will be obtained in Section 8.6.

Result (19), the case where F(X) is nonsingular, is of great practical interest. At points where |F(X)| is positive, its logarithm exists and we arrive at the following theorem.

Exercises

1. Give an intuitive explanation of the fact that d|X| = 0 at points X ∈ ℝ^n × n, where r(X) ≤ n − 2.
2. Show that, if F(X) ∈ ℝ^m × m and r(F(X)) = m − 1 for every X in some neighborhood of X₀, then d|F(X)| = 0 at X₀.
3. Show that d log |X′X| = 2 tr(X′X)⁻¹ X′dX at every point where X has full column rank.

4 THE DIFFERENTIAL OF AN INVERSE

The next theorem deals with the differential of the inverse function.

Let us consider the set T of nonsingular real m × m matrices. T is an open subset of ℝ^m × m, so that for every Y₀ ∈ T, there exists an open neighborhood N(Y₀) all of whose points are nonsingular. This follows from the continuity of the determinant function |Y|. Put differently, if Y₀ is nonsingular and {E_j} is a sequence of real m × m matrices such that E_j → 0 as j → ∞, then

(26)

for every j greater than some fixed j₀, and

Exercises

1. Let T₊ = {Y : Y ∈ ℝ^m × m, ∣ Y ∣ > 0}. If F : S → T₊, S ⊂ ℝ^n × q, is twice differentiable on S, then show that
d² log ∣ F ∣ = − tr(F⁻¹dF)² + tr F⁻¹d²F.
2. Show that, for X ∈ T₊, log |X| is ∞ times differentiable on T₊, and
3. Let ϕ(ɛ) = log |I + ɛA|. Show that the rth derivative of ϕ evaluated at ɛ = 0 is given by
4. Hence, we have the following approximation for small ɛ:

(Compare Exercises 2 and 3 in Section 1.13.)
5. Let T = {Y : Y ∈ ℝ^m × m, ∣ Y ∣ ≠ 0}. If F : S → T, S ⊂ ℝ^n × q, is twice differentiable on S, then show
d²F⁻¹ = 2[F⁻¹(dF)]²F⁻¹ − F⁻¹(d²F)F⁻¹.
6. Show that, for X ∈ T, X⁻¹ is ∞ times differentiable on T, and
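
The first two differentials of the inverse can be checked against a direct inversion. This sketch assumes the first differential dF⁻¹ = −F⁻¹(dF)F⁻¹ and uses the second differential above with d²F = 0 (a constant direction dF), so that the second-order Taylor term is t²(F⁻¹dF)²F⁻¹. The matrices and the step t are illustrative.

```python
# Compare (F + t dF)^{-1} with its first- and second-order approximations.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inverse2(X):
    """Inverse of a nonsingular 2 x 2 matrix."""
    d = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / d, -X[0][1] / d], [-X[1][0] / d, X[0][0] / d]]

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

F = [[2.0, 1.0], [1.0, 3.0]]
dF = [[0.4, -1.0], [0.3, 0.2]]
t = 1e-4

F_inv = inverse2(F)
K = matmul(F_inv, dF)            # F^{-1} dF
first = matmul(K, F_inv)         # F^{-1}(dF)F^{-1}
second = matmul(K, first)        # (F^{-1} dF)^2 F^{-1}

F_t = [[F[i][j] + t * dF[i][j] for j in range(2)] for i in range(2)]
exact = inverse2(F_t)
order1 = [[F_inv[i][j] - t * first[i][j] for j in range(2)] for i in range(2)]
order2 = [[order1[i][j] + t * t * second[i][j] for j in range(2)] for i in range(2)]

error1 = max_abs_diff(exact, order1)   # expected to be O(t^2)
error2 = max_abs_diff(exact, order2)   # expected to be O(t^3)
print(error1, error2)
```

The drop in error from the first-order to the second-order approximation is consistent with the factor 2 in d²F⁻¹, since the Taylor term is one half of the second differential.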

5 DIFFERENTIAL OF THE MOORE‐PENROSE INVERSE

Equation (26) above and Exercise 1 in Section 5.15 tell us that nonsingular matrices have locally constant rank. Singular matrices (more precisely matrices of less than full row and column rank) do not share this property. Consider, for example, the matrices

and let Y = Y(j) = Y₀ + E_j. Then r(Y₀) = 1, but r(Y) = 2 for all j. Moreover, Y → Y₀ as j → ∞, but

does not converge to Y₀⁺, because it does not converge to anything. It follows that (i) r(Y) is not constant in any neighborhood of Y₀ and (ii) Y⁺ is not continuous at Y₀. The following proposition shows that the conjoint occurrence of (i) and (ii) is typical.
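
One pair of matrices with the properties just described is sketched below. It is an illustrative choice made here, not necessarily the pair used in the original display: Y₀ has rank 1, every Y = Y₀ + E_j has rank 2, and Y⁺ = Y⁻¹ has an element j that grows without bound.

```latex
\[
Y_0 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad
E_j = \begin{pmatrix} 0 & 0 \\ 0 & 1/j \end{pmatrix}, \qquad
Y = Y_0 + E_j = \begin{pmatrix} 1 & 0 \\ 0 & 1/j \end{pmatrix},
\]
\[
Y^{+} = Y^{-1} = \begin{pmatrix} 1 & 0 \\ 0 & j \end{pmatrix}, \qquad
Y_0^{+} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.
\]
```

Here Y → Y₀ as j → ∞, while the (2, 2) element of Y⁺ diverges, so Y⁺ cannot converge to Y₀⁺.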

We now have all the ingredients for the main result.

Exercises

1. Prove (28).
2. If F(X) is idempotent for every X in some neighborhood of a point X₀, then F is said to be locally idempotent at X₀. Show that F(dF)F = 0 at points where F is differentiable and locally idempotent.
3. If F is locally idempotent at X₀ and continuous in a neighborhood of X₀, then tr F is differentiable at X₀ with d(tr F)(X₀) = 0.
4. If F has locally constant rank at X₀ and is continuous in a neighborhood of X₀, then tr F⁺ F and tr FF⁺ are differentiable at X₀ with d(tr F⁺F)(X₀) = d(tr FF⁺)(X₀) = 0.
5. If F has locally constant rank at X₀ and is differentiable in a neighborhood of X₀, then tr FdF⁺ = −trF⁺dF.

6 THE DIFFERENTIAL OF THE ADJOINT MATRIX

If Y is a real m × m matrix, then by Y^# we denote the m × m adjoint matrix of Y. Given an m × m matrix function F, we now define an m × m matrix function F^# by F^#(X) = (F(X))^#. The purpose of this section is to find the differential of F^#. We first prove Theorem 8.6.

Recall from Theorem 3.2 that if Y = F(X) is an m × m matrix and m ≥ 2, then the rank of Y^# = F^#(X) is given by

(34) r(Y^#) = m if r(Y) = m; r(Y^#) = 1 if r(Y) = m − 1; r(Y^#) = 0 if r(Y) ≤ m − 2.

As a result, two special cases of Theorem 8.6 can be proved. The first relates to the situation where F(X₀) is nonsingular.
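
At a nonsingular point the differential of the adjoint matrix can also be obtained directly from the rules of Section 2. The derivation below is a sketch that assumes the standard forms d|F| = |F| tr F⁻¹dF and dF⁻¹ = −F⁻¹(dF)F⁻¹ for the determinant and the inverse, and uses F^# = |F|F⁻¹ together with the product rule:

```latex
\[
\begin{aligned}
dF^{\#} &= d\bigl(|F|\,F^{-1}\bigr) \\
        &= (d|F|)\,F^{-1} + |F|\,dF^{-1} \\
        &= |F|\,(\operatorname{tr} F^{-1}dF)\,F^{-1} - |F|\,F^{-1}(dF)F^{-1} \\
        &= |F|\,\bigl[(\operatorname{tr} F^{-1}dF)\,F^{-1} - F^{-1}(dF)F^{-1}\bigr].
\end{aligned}
\]
```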

The second special case of Theorem 8.6 concerns points where the rank of F(X₀) does not exceed m − 3.

There is another, more illuminating, proof of Corollary 8.2, one which does not depend on Theorem 8.6. Let Y₀ ∈ ℝ^m × m and assume Y₀ is singular. Then r(Y) is not locally constant at Y₀. In fact, if r(Y₀) = r (1 ≤ r ≤ m − 1) and we perturb one element of Y₀, then the rank of the perturbed matrix will be r − 1, r, or r + 1. An immediate consequence of this simple observation is that if r(Y₀) does not exceed m − 3, then the rank of the perturbed matrix will not exceed m − 2. But this means that at points Y₀ with r(Y₀) ≤ m − 3, the adjoint matrices of both Y₀ and the perturbed matrix vanish, implying that the differential of Y^# at Y₀ must be the null matrix.

These two corollaries provide expressions for dF^# at every point X where r(F(X)) = m or r(F(X)) ≤ m − 3. The remaining points to consider are those where r(F(X)) is either m − 1 or m − 2. At such points we must unfortunately use Theorem 8.6, which holds irrespective of rank considerations.

Only if we know that the rank of F(X) is locally constant can we say more. If r(F(X)) = m − 2 for every X in some neighborhood N(X₀) of X₀, then F^#(X) vanishes in that neighborhood, and hence (dF^#)(X) = 0 for every X ∈ N(X₀). More complicated is the situation where r(F(X)) = m − 1 in some neighborhood of X₀. A discussion of this case is postponed to Miscellaneous Exercise 7 at the end of this chapter.

Exercise

1. The matrix function F : ℝ^n × n → ℝ^n × n defined by F(X) = X^# is ∞ times differentiable on ℝ^n × n, and (d^jF)(X) = 0 for every j ≤ n − 2 − r(X).

7 ON DIFFERENTIATING EIGENVALUES AND EIGENVECTORS

There are two problems involved in differentiating eigenvalues and eigenvectors. The first problem is that the eigenvalues of a real matrix A need not, in general, be real numbers — they may be complex. The second problem is the possible occurrence of multiple eigenvalues.

To appreciate the first point, consider the real 2 × 2 matrix function

The matrix A is not symmetric, and its eigenvalues are 1 ± iɛ. Since both eigenvalues are complex, the corresponding eigenvectors must be complex as well; in fact, they can be chosen as

We know, however, from Theorem 1.4 that if A is a symmetric matrix, then its eigenvalues are real and its eigenvectors can always be taken to be real. Since the derivations in the symmetric case are somewhat simpler, we shall only consider the symmetric case.

Thus, let X₀ be a symmetric n × n matrix, and let u₀ be a (normalized) eigenvector associated with an eigenvalue λ₀ of X₀, so that the triple (X₀, u₀, λ₀) satisfies the equations

(35) X₀u₀ = λ₀u₀, u₀′u₀ = 1.

Since the n + 1 equations in (35) are implicit relations rather than explicit functions, we must first show that there exist explicit unique functions λ = λ(X) and u = u(X) satisfying (35) in a neighborhood of X₀ and such that λ(X₀) = λ₀ and u(X₀) = u₀. Here, the second (and more serious) problem arises: the possible occurrence of multiple eigenvalues.

We shall see that the implicit function theorem (given in the Appendix to Chapter 7) implies the existence of a neighborhood N(X₀) ⊂ ℝ^n × n of X₀, where the functions λ and u both exist and are ∞ times (continuously) differentiable, provided λ₀ is a simple eigenvalue of X₀. If, however, λ₀ is a multiple eigenvalue of X₀, then the conditions of the implicit function theorem are not satisfied. The difficulty is illustrated by the real 2 × 2 matrix function

The matrix A is symmetric for every value of ɛ and δ; its eigenvalues are and . Both eigenvalue functions are continuous in ɛ and δ, but clearly not differentiable at (0, 0). (Strictly speaking, we should also prove that λ₁ and λ₂ are the only two continuous eigenvalue functions.) The conical surface formed by the eigenvalues of A(ɛ, δ) has a singularity at ɛ = δ = 0 (Figure 8.1). For a fixed ratio ɛ/δ however, we can pass from one side of the surface to the other going through (0, 0) without noticing the singularity. This phenomenon is quite general and it indicates the need to restrict our study of differentiability of multiple eigenvalues to one‐dimensional perturbations only.
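
The behaviour at (0, 0) can be made concrete with a small numerical sketch. It assumes, purely for illustration, the symmetric matrix A(ɛ, δ) with rows (1 + ɛ, δ) and (δ, 1 − ɛ), whose eigenvalues 1 ± √(ɛ² + δ²) form a cone with vertex at 1; this specific matrix and the function names are assumptions made here.

```python
import math

def eigenvalues(eps, delta):
    """Eigenvalues of the assumed matrix [[1 + eps, delta], [delta, 1 - eps]]."""
    radius = math.hypot(eps, delta)
    return 1.0 + radius, 1.0 - radius

def slope(direction, step):
    """Difference quotient of the larger eigenvalue along `direction` (signed step)."""
    a, b = direction
    larger, _ = eigenvalues(step * a, step * b)
    return (larger - 1.0) / step

# Along a fixed direction the one-sided slopes differ in sign: a kink at the origin.
slope_right = slope((3.0, 4.0), 1e-6)     # close to +5
slope_left = slope((3.0, 4.0), -1e-6)     # close to -5

# A differential would have to be linear in the direction, but the slopes are not additive.
slope_10 = slope((1.0, 0.0), 1e-6)        # close to 1
slope_01 = slope((0.0, 1.0), 1e-6)        # close to 1
slope_11 = slope((1.0, 1.0), 1e-6)        # close to sqrt(2), not 2

print(slope_right, slope_left, slope_10 + slope_01, slope_11)
```

Along a fixed line through the origin, switching from the larger to the smaller eigenvalue when the step changes sign gives the smooth function 1 + 5s, which is the "passing through the vertex" described above.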

Figure 8.1 The eigenvalue functions (a conical surface with its vertex at 1 on the vertical axis)

8 THE CONTINUITY OF EIGENPROJECTIONS

When employing arguments that require limits such as continuity or consistency, some care is required when dealing with eigenvectors and associated concepts.

We shall confine ourselves to an n × n symmetric (hence real) matrix, say A. If Ax = λx for some x ≠ 0, then λ is an eigenvalue of A and x is an eigenvector of A associated with λ. Because of the symmetry of A, all its eigenvalues are real and they are uniquely determined (Theorem 1.4). However, eigenvectors are not uniquely determined, not even when the eigenvalue is simple. Also, while the eigenvalues are typically continuous functions of the elements of the matrix, this is not necessarily so for the eigenvectors.

Some definitions are required. The set of all eigenvalues of A is called its spectrum and is denoted by σ(A). The eigenspace of A associated with λ is

The dimension of V(λ) is equal to the multiplicity of λ, say m(λ). Eigenspaces associated with distinct eigenvalues are orthogonal to each other. Because of the symmetry of A, we have the decomposition

The eigenprojection of A associated with λ of multiplicity m(λ), denoted by P(λ), is given by the symmetric idempotent matrix

where the {x_j} form any set of m orthonormal vectors in V(λ), that is, x_j′x_j = 1 and x_i′x_j = 0 for i ≠ j. While eigenvectors are not unique, the eigenprojection is unique because an idempotent matrix is uniquely determined by its range and null space. The spectral decomposition of A is then

If σ₀ is any subset of σ(A), then the total eigenprojection associated with the eigenvalues in σ₀ is defined as

It is clear that P(σ(A)) = I_n. Also, if σ₀ contains only one eigenvalue, say λ, then P({λ}) = P(λ). Total eigenprojections are a key concept when dealing with limits, as we shall see below.

Now consider a matrix function A(t), where A(t) is an n × n symmetric matrix for every real t. The matrix A(t) has n eigenvalues, say λ₁(t), …, λ_n(t), some of which may be equal. Suppose that A(t) is continuous at t = 0. Then, the eigenvalues are also continuous at t = 0 (Rellich 1953).

Let λ be an eigenvalue of A = A(0) of multiplicity m. Because of the continuity of the eigenvalues, we can separate the eigenvalues in two groups, say λ₁(t), …, λ_m(t) and λ_{m+1}(t), …, λ_n(t), where the m eigenvalues in the first group converge to λ, while the n − m eigenvalues in the second group also converge, but not to λ. The total eigenprojection P({λ₁(t), …, λ_m(t)}) is then continuous at t = 0, that is, it converges to the spectral projection P(λ) of A(0) (Kato 1976).

Kato's result does not imply that eigenvectors or eigenprojections are continuous. If all eigenvalues of A(t) are distinct at t = 0, then each eigenprojection P_j(t) is continuous at t = 0 because it coincides with the total eigenprojection for the eigenvalue λ_j(t). But if there are multiple eigenvalues at t = 0, then it may occur that the eigenprojections do not converge as t → 0, unless we assume that the matrix A(t) is (real) analytic. (A real‐valued function is real analytic if it is infinitely differentiable and can be expanded in a power series; see Section 6.14.) In fact, if A(t) is real analytic at t = 0 then the eigenvalues and the eigenprojections are also analytic at t = 0 and therefore certainly continuous (Kato 1976).

Hence, in general, eigenvalues are continuous, but eigenvectors and eigenprojections may not be. This is well illustrated by the following example.

Example 8.1 (Kato)

Consider the matrix

There is a multiple eigenvalue 0 at t = 0 and simple eigenvalues and at t ≠ 0. The associated eigenvectors are

Hence, the associated eigenprojections are

and

The matrix function A(t) is continuous (even infinitely differentiable) for all real t. This is also true for the eigenvalues. But there is no eigenvector which is continuous in the neighborhood of t = 0 and does not vanish at t = 0. Also, the eigenprojections P₁(t) and P₂(t), while continuous (even infinitely differentiable) in any interval not containing t = 0, cannot be extended to t = 0 as continuous functions.

The total eigenprojection is given by P₁(t) + P₂(t) = I₂, which is obviously continuous at t = 0, but the underlying eigenprojections P₁(t) and P₂(t) are not. The reason lies in the fact that the matrix A(t), while infinitely differentiable at t = 0, is not analytic.
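
The oscillation of the individual eigenprojections can be seen numerically. The sketch below assumes, for illustration only, that the two eigenvectors are (cos(1/t), sin(1/t))′ and (sin(1/t), −cos(1/t))′; this specific form is an assumption made here, chosen to match the behaviour described in the example.

```python
import math

def eigenprojections(t):
    """P1(t) and P2(t) for the assumed unit eigenvectors (c, s)' and (s, -c)',
    where c = cos(1/t) and s = sin(1/t), t != 0."""
    c, s = math.cos(1.0 / t), math.sin(1.0 / t)
    P1 = [[c * c, c * s], [c * s, s * s]]
    P2 = [[s * s, -c * s], [-c * s, c * c]]
    return P1, P2

k = 50
t_a = 1.0 / (2 * k * math.pi)                  # 1/t is a multiple of 2*pi
t_b = 1.0 / (2 * k * math.pi + math.pi / 2)    # 1/t is shifted by pi/2

P1_a, P2_a = eigenprojections(t_a)
P1_b, P2_b = eigenprojections(t_b)

# Total eigenprojection: P1 + P2 equals the identity at every t != 0.
total_a = [[P1_a[i][j] + P2_a[i][j] for j in range(2)] for i in range(2)]

print(P1_a[0][0], P1_b[0][0])   # close to 1 and close to 0
```

Both t_a and t_b can be made arbitrarily small by increasing k, yet P₁ alternates between projecting on the first and on the second coordinate axis, so it has no limit as t → 0, while P₁ + P₂ stays equal to I₂.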

This can be seen as follows. Let

and define h(t) = f(t)g(t). We know from Example 6.3 that the function f(t) is infinitely differentiable for all (real) t, but not analytic. The function g(t) is not continuous at t = 0, although it is infinitely differentiable in any interval not containing t = 0. Their product h(t) is infinitely differentiable for all (real) t (because g is bounded), but it is not analytic. We summarize the previous discussion as follows.

We complete this section with a remark on orthogonal transformations. Let B be an m × n matrix of full column rank n. Then A = B′B is positive definite and symmetric, and we can decompose

where Λ is diagonal with strictly positive elements and S is orthogonal.

Suppose that our calculations would be much simplified if A were equal to the identity matrix. We can achieve this by transforming B to a matrix C, as follows:

where T is an arbitrary orthogonal matrix. Then,

The matrix T is completely arbitrary, as long as it is orthogonal. It is tempting to choose T = I_n. This, however, implies that if B = B(t) is a continuous function of some variable t, then C = C(t) is not necessarily continuous, as is shown by the previous discussion. There is only one choice of T that leads to continuity of C, namely T = S, in which case

9 THE DIFFERENTIAL OF EIGENVALUES AND EIGENVECTORS: SYMMETRIC CASE

Let us now demonstrate the following theorem.

Proof

Consider the vector function f : ℝ^(n+1) × ℝ^(n×n) → ℝ^(n+1) defined by the equation

and observe that f is ∞ times differentiable on ℝ^(n+1) × ℝ^(n×n). The point (u₀, λ₀; X₀) in ℝ^(n+1) × ℝ^(n×n) satisfies

and

(38)

We note that the determinant in (38) is nonzero if and only if the eigenvalue λ₀ is simple, in which case it takes the value of −2 times the product of the n − 1 nonzero eigenvalues of λ₀I_n − X₀ (see Theorem 3.5).

The conditions of the implicit function theorem (Theorem 7.16) thus being satisfied, there exist a neighborhood N(X₀) ⊂ ℝ^n × n of X₀, a unique real‐valued function λ : N(X₀) → ℝ, and a unique (apart from its sign) vector function u : N(X₀) → ℝⁿ, such that

(a) λ and u are ∞ times differentiable on N(X₀),
(b) λ(X₀) = λ₀, u(X₀) = u₀,
(c) Xu = λu, u′u = 1 for every X ∈ N(X₀).

This completes the first part of our proof.

Let us now derive an explicit expression for dλ. From Xu = λu, we obtain

(39) (dX)u₀ + X₀du = (dλ)u₀ + λ₀du,

where the differentials dλ and du are defined at X₀. Premultiplying by u₀′ gives

u₀′(dX)u₀ + u₀′X₀du = (dλ)u₀′u₀ + λ₀u₀′du.

Since X₀ is symmetric, we have u₀′X₀ = λ₀u₀′. Hence,

dλ = u₀′(dX)u₀,

because the eigenvector u₀ is normalized by u₀′u₀ = 1. The normalization of u is not important here; it is important however, in order to obtain an expression for du. To this we now turn. Let Y₀ = λ₀I_n − X₀ and rewrite (39) as

Y₀du = (dX)u₀ − (dλ)u₀.

Premultiplying by Y₀⁺, we obtain

Y₀⁺Y₀du = Y₀⁺(dX)u₀,

because Y₀⁺u₀ = 0 (Exercise 1 below). To complete the proof, we only need to show that

(40) Y₀⁺Y₀du = du.

To prove (40), let

C₀ = Y₀⁺Y₀ + u₀u₀′.

The matrix C₀ is symmetric idempotent (because Y₀u₀ = 0), so that r(C₀) = r(Y₀) + 1 = n. Hence, C₀ = I_n and

du = C₀du = (Y₀⁺Y₀ + u₀u₀′)du = Y₀⁺Y₀du,

since u₀′du = 0 because of the normalization u′u = 1. (See Exercise 5 in Section 8.2.) This shows that (40) holds and concludes the proof.
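
The first-order expressions obtained in this proof can be checked by finite differences. The sketch below assumes they read dλ = u₀′(dX)u₀ and du = (λ₀I_n − X₀)⁺(dX)u₀; the 2 × 2 test matrix, the (nonsymmetric) perturbation, and all function names are illustrative.

```python
import math

def larger_eigenpair(X):
    """Larger eigenvalue and unit eigenvector (positive first entry) of a real
    2 x 2 matrix with real eigenvalues and X[0][1] != 0."""
    half_trace = 0.5 * (X[0][0] + X[1][1])
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    lam = half_trace + math.sqrt(half_trace ** 2 - det)
    v = (X[0][1], lam - X[0][0])       # solves the first row of (X - lam I)v = 0
    norm = math.hypot(v[0], v[1])
    sign = 1.0 if v[0] > 0 else -1.0
    return lam, (sign * v[0] / norm, sign * v[1] / norm)

X0 = [[2.0, 1.0], [1.0, 2.0]]          # eigenvalues 3 (simple) and 1
dX = [[0.4, -1.0], [0.3, 0.2]]         # perturbation, not assumed symmetric
t = 1e-6

lam0, u0 = larger_eigenpair(X0)
Xt = [[X0[i][j] + t * dX[i][j] for j in range(2)] for i in range(2)]
lam_t, u_t = larger_eigenpair(Xt)

# dX u0, used in both formulas.
w = [dX[i][0] * u0[0] + dX[i][1] * u0[1] for i in range(2)]

# Assumed formula: d(lambda) = u0'(dX)u0.
dlam = u0[0] * w[0] + u0[1] * w[1]

# Y0 = lam0 I - X0 = 2 u1 u1' with u1 = (1, -1)'/sqrt(2), so Y0+ = u1 u1' / 2.
Y0_plus = [[0.25, -0.25], [-0.25, 0.25]]
# Assumed formula: du = Y0+ (dX) u0.
du = [Y0_plus[i][0] * w[0] + Y0_plus[i][1] * w[1] for i in range(2)]

dlam_error = abs((lam_t - lam0) / t - dlam)
du_error = max(abs((u_t[i] - u0[i]) / t - du[i]) for i in range(2))
print(dlam_error, du_error)            # both of order t
```

The Moore‐Penrose inverse is written down by hand here from the spectral decomposition of the symmetric matrix Y₀, which keeps the sketch free of external libraries.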

We have chosen to normalize the eigenvector u by u′u = 1, which means that u is a point on the unit ball. This is, however, not the only possibility. Another normalization,

(41) u₀′u = 1,

though less common, is in some ways more appropriate. If the eigenvectors are normalized according to (41), then u is a point in the hyperplane tangent (at u₀) to the unit ball. In either case we obtain u′du = 0 at X = X₀, which is all that is needed in the proof.

We also note that, while X₀ is symmetric, the perturbations are not assumed to be symmetric. For symmetric perturbations, application of Theorem 2.2 and the chain rule immediately yields

and

where D_n is the duplication matrix (see Chapter 3).

Exercises

1. If A = A′, then Ab = 0 if and only if A⁺b = 0.
2. Consider the symmetric 2 × 2 matrix

When λ₀ = 1 show that, at X₀,
and derive the corresponding result when λ₀ = −1. Interpret these results.
3. Now consider the matrix function

Plot a graph of the two eigenvalue functions λ₁(ɛ) and λ₂(ɛ), and show that the derivative at ɛ = 0 vanishes. Also obtain this result directly from the previous exercise.
4. Consider the symmetric matrix

Show that the eigenvalues of X₀ are 3 (twice) and 7, and prove that at X₀ the differentials of the eigenvalue and eigenvector function associated with the eigenvalue 7 are
and
where

p(X) = (x₁₂, x₂₂, x₃₂, x₁₃, x₂₃, x₃₃)^′.

10 TWO ALTERNATIVE EXPRESSIONS FOR dλ

As we have seen, the differential (36) of the eigenvalue function associated with a simple eigenvalue λ₀ of a symmetric n × n matrix X₀ can be expressed as

(42) dλ = tr P₀dX, P₀ = u₀u₀′,

where u₀ is the eigenvector of X₀ associated with λ₀:

(43) X₀u₀ = λ₀u₀, u₀′u₀ = 1.

The matrix P₀ is idempotent with r(P₀) = 1.

Let us now express P₀ in two other ways: first as a product of n − 1 matrices and then as a weighted sum of the matrices I_n, X₀, …, X₀ⁿ⁻¹.

Proof

Consider the following two matrices of order n × n

The Cayley‐Hamilton theorem (Theorem 1.10) asserts that

Further, since λ_i is a simple eigenvalue of the symmetric matrix X₀ we see from Theorem 1.20 that r(A) = n − 1. Hence, application of Theorem 3.6 shows that

(46)

where u₀ is defined in (43) and μ is an arbitrary scalar.

To determine the scalar μ, we first note from (46) that μ = tr B. We know from Theorem 1.13 that there exists an orthogonal matrix S and a diagonal matrix Λ containing λ₁, λ₂, …, λ_n on its diagonal such that

Then,

where E_ii is the n × n matrix with a one in its ith diagonal position and zeros elsewhere. It follows that

and hence

which by (42) is what we wanted to show.
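
The product representation established by this argument can be verified on a small example. The sketch assumes it reads P₀ = Π_{j≠i} (λ_jI_n − X₀)/(λ_j − λ_i), which is what B = μu₀u₀′ with μ = tr B implies; the 3 × 3 test matrix is an illustrative choice with known eigenvalues 1, 3, and 5.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

n = 3
X0 = [[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 5.0]]
eigenvalues = [1.0, 3.0, 5.0]        # all simple
lam_i = 3.0                           # eigenvalue of interest
u0 = [1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0), 0.0]   # X0 u0 = 3 u0, u0'u0 = 1

# Direct definition: P0 = u0 u0'.
P0_direct = [[u0[r] * u0[c] for c in range(n)] for r in range(n)]

# Assumed product form: one factor (lam_j I - X0)/(lam_j - lam_i) for each j != i.
P0_product = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
for lam_j in eigenvalues:
    if lam_j != lam_i:
        factor = [[((lam_j if r == c else 0.0) - X0[r][c]) / (lam_j - lam_i)
                   for c in range(n)] for r in range(n)]
        P0_product = matmul(P0_product, factor)

gap = max(abs(P0_direct[r][c] - P0_product[r][c]) for r in range(n) for c in range(n))
print(gap)   # zero up to rounding
```

The product form needs only the eigenvalues of X₀, not the eigenvector u₀, which is the practical appeal of this representation.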

Let us now prove (45). Since S′X₀S = Λ, we have

(47)

This gives

(48)

where δ_ik denotes the Kronecker delta and we use the fact that is the inner product of the ith row of V⁻¹ and the kth column of V, that is

Inserting (48) in (47) yields

where e_i is the ith elementary vector with one in its ith position and zeros elsewhere. Since λ_i is a simple eigenvalue of X₀, we have Se_i = δu₀ for some scalar δ. In fact, δ² = 1, because

Hence,

This concludes the proof, using (42).

Exercise

1. Show that the elements in the first column of V⁻¹ sum to one, and the elements in any other column of V⁻¹ sum to zero.

11 SECOND DIFFERENTIAL OF THE EIGENVALUE FUNCTION

One application of the differential of the eigenvector du is to obtain the second differential of the eigenvalue: d²λ. We consider only the case where X₀ is a symmetric matrix.

Exercises

1. Show that Equation (49) can be written as
and also as
2. Show that if λ₀ is the largest eigenvalue of X₀, then d²λ ≥ 0. Relate this to the fact that the largest eigenvalue is convex on the space of symmetric matrices. (Compare Theorem 11.5.)
3. Similarly, if λ₀ is the smallest eigenvalue of X₀, show that d²λ ≤ 0.
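
The sign claim for the largest eigenvalue in Exercise 2 can be illustrated with a central second difference. The sketch also compares the result with the closed form d²λ = 2u₀′(dX)(λ₀I_n − X₀)⁺(dX)u₀, which is assumed here to be the content of Equation (49); the test matrices and names are illustrative.

```python
import math

def largest_eigenvalue(X):
    """Largest eigenvalue of a symmetric 2 x 2 matrix."""
    return 0.5 * (X[0][0] + X[1][1]) + math.hypot(0.5 * (X[0][0] - X[1][1]), X[0][1])

X0 = [[2.0, 1.0], [1.0, 2.0]]        # eigenvalues 3 and 1
dX = [[0.4, -0.7], [-0.7, 0.2]]      # symmetric perturbation direction
t = 1e-3

def shifted(step):
    return [[X0[i][j] + step * dX[i][j] for j in range(2)] for i in range(2)]

# Central second difference of the largest eigenvalue along dX (d2X = 0).
second_difference = (largest_eigenvalue(shifted(t)) + largest_eigenvalue(shifted(-t))
                     - 2.0 * largest_eigenvalue(X0)) / t ** 2

# Assumed closed form, with lambda0 = 3, u0 = (1, 1)'/sqrt(2) and
# (lambda0 I - X0)+ written out from the spectral decomposition of X0.
u0 = (1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0))
Y0_plus = [[0.25, -0.25], [-0.25, 0.25]]
w = [dX[i][0] * u0[0] + dX[i][1] * u0[1] for i in range(2)]          # dX u0
Yw = [Y0_plus[i][0] * w[0] + Y0_plus[i][1] * w[1] for i in range(2)]
predicted = 2.0 * (w[0] * Yw[0] + w[1] * Yw[1])   # u0'dX = (dX u0)' since dX is symmetric

print(second_difference, predicted)
```

A nonnegative second difference in every symmetric direction is what convexity of the largest eigenvalue requires; replacing the largest by the smallest eigenvalue flips the sign, in line with Exercise 3.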

MISCELLANEOUS EXERCISES

1. Consider the function
where A is positive definite and X has full column rank. Show that

(For the second differential, see Miscellaneous Exercise 4 in Chapter 10.)
2. In generalizing the fundamental rule dx^k = kx^(k−1) dx to matrices, show that it is not true, in general, that dX^k = kX^(k−1) dX. It is true, however, that

d tr X^k = k tr X^(k−1) dX (k = 1, 2, …).
Prove that this also holds for real k ≥ 1 when X is positive semidefinite.
3. Consider a matrix X₀ with distinct eigenvalues λ₁, λ₂, …, λ_n. From the fact that deduce that at X₀,
4. Conclude from the foregoing that at X₀,

Write this system of n equations as

Solve dλ₁. This provides an alternative proof of the second part of Theorem 8.10.
5. At points X where the eigenvalues λ₁, λ₂, …, λ_n of X are distinct, show that

In particular, at points where one of the eigenvalues is zero,
where λ_n is the (simple) zero eigenvalue.
6. Use the previous exercise and the fact that d|X| = tr X^# dX and dλ_n = v′(dX)u/v′u, where X^# is the adjoint matrix of X and Xu = X′v = 0, to show that
at points where λ_n = 0 is a simple eigenvalue. (Compare Theorem 3.3.)
7. Let F : S → ℝ^m × m(m ≥ 2) be a matrix function, defined on a set S in ℝ^n × q and differentiable at a point X₀ ∈ S. Assume that F(X) has a simple eigenvalue 0 at X₀ and in a neighborhood N(X₀) ⊂ S of X₀. (This implies that r(F(X)) = m − 1 for every X ∈ N (X₀).) Then,
where and . Show that if F(X₀) is symmetric. What is R₀ if F(X₀) is not symmetric?
8. Let F : S → ℝ^m × m(m ≥ 2) be a symmetric matrix function, defined on a set S in ℝ^n × q and differentiable at a point X₀ ∈ S. Assume that F(X) has a simple eigenvalue 0 at X₀ and in a neighborhood of X₀. Let F₀ = F(X₀). Then,
9. Define the matrix function
which is well‐defined for every square matrix X, real or complex. Show that
and in particular,

tr(d exp(X)) = tr(exp(X)(dX)).
10. Let S_n denote the set of n × n symmetric matrices whose eigenvalues are smaller than 1 in absolute value. For X in S_n, show that
11. For X in S_n, define

Show that
and in particular,

tr(d log(I_n − X)) = − tr((I_n − X)⁻¹dX).

BIBLIOGRAPHICAL NOTES

3. The second proof of Theorem 8.1 is inspired by Brinkhuis (2015).

5. Proposition 8.1 is due to Penrose (1955, p. 408) and Proposition 8.2 is due to Hearon and Evans (1968). See also Stewart (1969). Theorem 8.5 is due to Golub and Pereyra (1973).

7. The development follows Magnus (1985). Figure 8.1 was suggested to us by Roald Ramer.

8. This section is taken, slightly adjusted and with permission from Elsevier, from De Luca, Magnus, and Peracchi (2018). The relevant literature includes Rellich (1953) and Kato (1976, especially Theorems 1.10 and 5.1). The example is due to Kato (1976, Example 5.3). Theorem 8.8 is essentially the same as Horn and Johnson (1991, Theorem 6.2.37) but with a somewhat simpler proof.

9–11. There is an extensive literature on the differentiability of eigenvalues and eigenvectors; see Lancaster (1964), Sugiura (1973), Bargmann and Nel (1974), Kalaba, Spingarn, and Tesfatsion (1980, 1981a, 1981b), Magnus (1985), Andrew, Chu, and Lancaster (1993), Kollo and Neudecker (1997b), and Dunajeva (2004) for further results.
