5.2 Orthogonal Subspaces

Let A be an m×n matrix and let xN(A), the null space of A. Since Ax=0, we have

ai1x1+ai2x2++ainxn=0
(1)

for i=1,,m. Equation (1) says that x is orthogonal to the ith column vector of AT for i=1,,m. Since x is orthogonal to each column vector of AT, it is orthogonal to any linear combination of the column vectors of AT. So if y is any vector in the column space of AT, then xTy=0. Thus, each vector in N(A) is orthogonal to every vector in the column space of AT. When two subspaces of n have this property, we say that they are orthogonal.

Example 1

Let X be the subspace of 3 spanned by e1, and let Y be the subspace spanned by e2. If xX, these vectors must be of the form

x=[x100]andy=[0y20]

Thus,

xTy=x10+0y2+00=0

Therefore, XY.

The concept of orthogonal subspaces does not always agree with our intuitive idea of perpendicularity. For example, the floor and wall of the classroom “look” orthogonal, but the xy-plane and the yz-plane are not orthogonal subspaces. Indeed, we can think of the vectors x1=(1,1,0)T and x2=(0,1,0)T as lying in the xy- and yz-planes, respectively. Since

x1Tx2=10+11+01=1

the subspaces are not orthogonal. The next example shows that the subspace corresponding to the z-axis is orthogonal to the subspace corresponding to the xy-plane.

Example 2

Let X be the subspace of 3 spanned by e1 and e2, and let Y be the subspace spanned by e3. If xX and yY, then

xTy=x10+x20+0y3=0

Thus, XY. Furthermore, if z is any vector in 3 that is orthogonal to every vector in Y, then ze3, and hence

z3=zTe3=0

But if z3=0, then zX. Therefore, X is the set of all vectors in 3 that are orthogonal to every vector in Y (see Figure 5.2.1).

Figure 5.2.1.

A vector diagram represents orthogonality on a three dimensional plane.

Note

The subspaces X=Span(e1) and X=Span(e2) of 3 given in Example 1 are orthogonal, but they are not orthogonal complements. Indeed,

X=Span(e2,e3)andY=Span(e1,e3)

Remarks

  • 1. If X and Y are orthogonal subspaces of n, then XY={0}.

  • 2. If Y is a subspace of n, then Y is also a subspace of n.

Proof of (1)

If xXY and XY, then x2=xTx=0 and hence x=0.

Proof of (2)

If xY and α is a scalar, then for any yY,

(αx)Ty=α(xTy)=α0=0

Therefore, αxY. If x1 and x2 are elements of Y, then

(x1+x2)Ty=x1Ty+x2Ty=0+0=0

for each yY. Hence, x1+x2Y. Therefore, Y is a subspace of n.

Fundamental Subspaces

Let A be an m×n matrix. We saw in Chapter 3 that a vector bm is in the column space of A if and only if b=Ax for some xn. If we think of A as a linear transformation mapping n into m, then the column space of A is the same as the range of A. Let us denote the range of A by R(A). Thus,

R(A)={bm|b=Ax    for some xn}=the column space of A

The column space of AT,R(A)T), is a subspace of n:

R(AT)={yn|y=ATx    for some  xm}

The column space of R(AT) is essentially the same as the row space of A, except that it consists of vectors in n (n×1 matrices) rather than n-tuples. Thus, yR(AT)) if and only if yT is in the row space of A. We have seen that R(AT)N(A). The following theorem shows that N(A) is actually the orthogonal complement of R(AT).

Theorem 5.2.1 Fundamental Subspaces Theorem

If A is an m×n matrix, then N(A)=R(AT)andN(AT)=R(A).

Proof

On the one hand, we have already seen that N(A)R(AT), and this implies that N(A)R(AT). On the other hand, if x is any vector in R(AT), then x is orthogonal to each of the column vectors of AT and, consequently, Ax=0. Thus, x must be an element of N(A) and hence N(A)=R(AT). This proof does not depend on the dimensions of A. In particular, the result will also hold for the matrix B=AT. Consequently,

N(AT)=N(B)=R(BT)=R(A)

Example 3

Let

A=[1020]

The column space of A consists of all vectors of the form

[α2α]=α[12]

Note that if x is any vector in 2 and b=Ax, then

b=[1020][x1x2]=[1x12x1]=x1[12]

The null space of AT consists of all vectors of the form β(2,1). Since (1,2)T and (2,1)T are orthogonal, it follows that every vector in R(A) will be orthogonal to every vector in N(AT). The same relationship holds between R(AT) and N(A). R(AT) consists of vectors of the form αe1, and N(A) consists of all vectors of the form βe2. Since e1 and e2 are orthogonal, it follows that each vector in R(AT) is orthogonal to every vector in N(A).

Theorem 5.2.1 is one of the most important theorems in this chapter. In Section 5.3, we will see that the result N(AT)=R(A) provides a key to solving least squares problems. For the present, we will use Theorem 5.2.1 to prove the following theorem, which, in turn, will be used to establish two more important results about orthogonal subspaces.

Theorem 5.2.2

If S is a subspace of n, then S+dim S=n. Furthermore, if {x1,,xr} is a basis for S and {xr+1,,xn} is a basis for S, then {x1,,xr,xr+1,,xn} is a basis for n.

Proof

If S={0}, then S=n and

dim S+dim S=0+n=n

If S{0}, then let {x1,,xr}, be a basis for S and define X to be an r×n matrix whose ith row is xiT for each i. By construction, the matrix X has rank r and R(XT)=S. By Theorem 5.2.1,

S=R(XT)=N(X)

It follows from Theorem 3.6.5 that

dim S=dim N(X)=nr

To show that {x1,,xr,xr+1,,xn} is a basis for n, it suffices to show that the n vectors are linearly independent. Suppose that

c1x1++crxr+cr+1xr+cr+1xr+1++cnxn=0

Let y=c1x1++crxr and z=cr+1xr+1++cnxn. We then have

y+z=0y=z

Thus, y and z are both elements of SS. But SS={0}. Therefore,

c1x1++crxr=0cr+1xr+1++cnxn=0

Since x1,,xr are linearly independent,

c1=c2==cr=0

Similarly, xr+1,,xn are linearly independent and hence

cr+1=cr+2==cn=0

So x1,x2,,xn are linearly independent and form a basis for n.

Given a subspace S of n, we will use Theorem 5.2.2 to prove that each xn can be expressed uniquely as a sum y+z, where yS andzS.

Theorem 5.2.3

If S is a subspace of n, then

n=SS

Proof

The result is trivial if either S={0} or S=n. In the case where dim S=r,0<r<n, it follows from Theorem 5.2.2 that each vector xn can be represented in the form

x=c1x1++crxr+crxr+1++cnxn

where {x1,,xr} is a basis for S and {xr+1,,xn} is a basis for S. If we let

u=c1x1++crxrandv=cr+1xr+1++cnxn

then uS,vS, and x=u+v. To show uniqueness, suppose that x can also be written as a sum y+z, where yS and zS. Thus,

u+v=x=y+zuv=zv

But uyS and zvS, so each is in SS. Since

SS={0}

it follows that

u=yandv=z

Theorem 5.2.4

If S is a subspace of n, then (S)=S.

Proof

On the one hand, if xS, then x is orthogonal to each y in S. Therefore, x(S) and hence S(S). On the other hand, suppose that z is an arbitrary element of (S). By Theorem 5.2.3, we can write z as a sum u+v, where uS and vS. Since vS, it is orthogonal to both u and z. It then follows that

0=vTz=vTu+vTv=vTv

and, consequently, v=0. Therefore, z=uS and hence S=(S).

It follows from Theorem 5.2.4 that if T is the orthogonal complement of a subspace S, then S is the orthogonal complement of T, and we may say simply that S and T are orthogonal complements. In particular, it follows from Theorem 5.2.1 that N(A) and R(AT) are orthogonal complements of each other and that N(AT) and R(A) are orthogonal complements. Hence, we may write

N(A)=R(AT)andN(AT)=R(A)

Recall that the system Ax=b is consistent if and only if bR(A). Since R(A)=N(AT), we have the following result, which may be considered a corollary to Theorem 5.2.1.

Corollary 5.2.5

If A is an m×n matrix and bm, then either there is a vector xn such that Ax=b or there is a vector ym such that ATy=0 and yTb0.

Corollary 5.2.5 is illustrated in Figure 5.2.2 for the case where R(A) is a two-dimensional subspace of 3. The angle θ in the figure will be a right angle if and only if bR(A).

Figure 5.2.2.

A vector diagram has two vectors on a plane.

Example 4

Let

A=[112011134]

Find the bases for N(A), R(AT),N(AT), and R(A).

SOLUTION

We can find bases for N(A) and R(AT) by transforming A into reduced row echelon form:

[112011134][112011022][101011000]

Since (1, 0, 1) and (0, 1, 1) form a basis for the row space of A, it follows that (1,0,1)T and (0,1,1)T form a basis for R(AT). If xN(A), it follows from the reduced row echelon form of A that

x1+x3=0x2+x3=0

Thus,

x1=x2=x3

Setting x3=α, we see that N(A) consists of all vectors of the form α(1,1,1)T. Note that (1,1,1)T is orthogonal to (1,0,1)T and (0,1,1)T.

To find bases for R(A) and N(AT), transform AT to reduced row echelon form.

[101113214][101012012][101012000]

Thus, (1,0,1)T and (0,1,2)T form a basis for R(A). If xN(AT), then x1=x3,x2=2x3. Hence, N(AT) is the subspace of 3 spanned by (1,2,1)T. Note that (1,2,1)T is orthogonal to (1,0,1)T and (0,1,2)T.

We saw in Chapter 3 that the row space and the column space have the same dimension. If A has rank r, then

dim R(A)=dim R(AT)=r

Actually, A can be used to establish a one-to-one correspondence between R(AT) and R(A).

We can think of an m×n matrix A as a linear transformation from n to m:

xnAxm

Since R(AT) and N(A) are orthogonal complements in n,

n=R(AT)N(A)

Each vector xn can be written as a sum

x=y+z,yR(AT),zN(A)

It follows that

Ax=Ay+Az=Ayfor each xn

and hence

R(A)={Ax | xn}={Ay | y R(AT}

Thus, if we restrict the domain of A to R(AT), then A maps R(AT) onto R(A). Furthermore, the mapping is one-to-one. Indeed, if x1,x2R(AT) and

Ax1=Ax2

then

A(x1x2)=0

and hence

x1x2R(AT)N(A)

Since R(AT)N(A)={0}, it follows that x1=x2. Therefore, we can think of A as determining a one-to-one correspondence between R(AT) and R(A). Since each bR(A) corresponds to exactly one yR(AT), we can define an inverse transformation from R(A) to R(AT). Indeed, every m×n matrix A is invertible when viewed as a linear transformation from R(AT) to R(A).

Example 5

Let A=[200030].R(AT) is spanned by e1 and e2, and N(A) is spanned by e3. Any vector x3 can be written as a sum

x=y+z

where

y=(x1,x2,0)TR(AT)andz=(0,0,x3)TN(A)

If we restrict ourselves to vectors yR(AT), then

y=[x1x20]Ay=[2x13x2]

In this case, R(A)=2 and the inverse transformation from R(A) to R(AT) is defined by

b=[b1b2][12b113b20]
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.223.0.53