5.5 Orthonormal Sets

In $\mathbb{R}^2$, it is generally more convenient to use the standard basis $\{\mathbf{e}_1, \mathbf{e}_2\}$ than to use some other basis, such as $\{(2,1)^T, (3,5)^T\}$. For example, it is easier to find the coordinates of $(x_1, x_2)^T$ with respect to the standard basis. The elements of the standard basis are orthogonal unit vectors. In working with an inner product space V, it is generally desirable to have a basis of mutually orthogonal unit vectors; recall that a set of nonzero vectors is said to be orthogonal if the vectors are mutually orthogonal, and orthonormal if, in addition, each vector has length 1. Such a basis is convenient not only for finding coordinates of vectors, but also for solving least squares problems.

Example 1

The set $\{(1,1,1)^T, (2,1,-3)^T, (4,-5,1)^T\}$ is an orthogonal set in $\mathbb{R}^3$, since

$$(1,1,1)(2,1,-3)^T = 0 \qquad (1,1,1)(4,-5,1)^T = 0 \qquad (2,1,-3)(4,-5,1)^T = 0$$

Theorem 5.5.1

If $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is an orthogonal set of nonzero vectors in an inner product space V, then $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ are linearly independent.

Proof

Suppose that $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ are mutually orthogonal nonzero vectors and

$$c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n = \mathbf{0} \tag{1}$$

If $1 \le j \le n$, then, taking the inner product of $\mathbf{v}_j$ with both sides of equation (1), we see that

$$c_1\langle\mathbf{v}_j, \mathbf{v}_1\rangle + c_2\langle\mathbf{v}_j, \mathbf{v}_2\rangle + \cdots + c_n\langle\mathbf{v}_j, \mathbf{v}_n\rangle = 0$$

$$c_j\|\mathbf{v}_j\|^2 = 0$$

Since $\mathbf{v}_j \neq \mathbf{0}$, it follows that $c_j = 0$, and hence all the scalars $c_1, c_2, \ldots, c_n$ must be 0.

The set $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ will be orthonormal if and only if

$$\langle\mathbf{u}_i, \mathbf{u}_j\rangle = \delta_{ij}$$

where

$$\delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$$

Given any orthogonal set of nonzero vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$, it is possible to form an orthonormal set by defining

$$\mathbf{u}_i = \left(\frac{1}{\|\mathbf{v}_i\|}\right)\mathbf{v}_i \quad \text{for } i = 1, 2, \ldots, n$$

The reader may verify that $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ will be an orthonormal set.

Example 2

We saw in Example 1 that if $\mathbf{v}_1 = (1,1,1)^T$, $\mathbf{v}_2 = (2,1,-3)^T$, and $\mathbf{v}_3 = (4,-5,1)^T$, then $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is an orthogonal set in $\mathbb{R}^3$. To form an orthonormal set, let

$$\mathbf{u}_1 = \left(\frac{1}{\|\mathbf{v}_1\|}\right)\mathbf{v}_1 = \frac{1}{\sqrt{3}}(1,1,1)^T$$

$$\mathbf{u}_2 = \left(\frac{1}{\|\mathbf{v}_2\|}\right)\mathbf{v}_2 = \frac{1}{\sqrt{14}}(2,1,-3)^T$$

$$\mathbf{u}_3 = \left(\frac{1}{\|\mathbf{v}_3\|}\right)\mathbf{v}_3 = \frac{1}{\sqrt{42}}(4,-5,1)^T$$
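This computation is easy to check numerically. The following is a minimal sketch (assuming NumPy is available) that verifies the pairwise orthogonality of $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$ and then normalizes each vector:

```python
import numpy as np

# Vectors from Examples 1 and 2.
v1 = np.array([1.0, 1.0, 1.0])
v2 = np.array([2.0, 1.0, -3.0])
v3 = np.array([4.0, -5.0, 1.0])

# All pairwise scalar products should be zero.
for a, b in [(v1, v2), (v1, v3), (v2, v3)]:
    assert abs(a @ b) < 1e-12

# Normalize: u_i = (1 / ||v_i||) v_i, collected as the columns of U.
U = np.column_stack([v / np.linalg.norm(v) for v in (v1, v2, v3)])

# For an orthonormal set, U^T U should be the identity matrix.
print(np.allclose(U.T @ U, np.eye(3)))   # True
```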

Example 3

In $C[-\pi, \pi]$ with inner product

$$\langle f, g\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)g(x)\,dx \tag{2}$$

the set $\{1, \cos x, \cos 2x, \ldots, \cos nx\}$ is an orthogonal set of vectors, since for any positive integers j and k

$$\langle 1, \cos kx\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} \cos kx\,dx = 0$$

$$\langle \cos jx, \cos kx\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} \cos jx \cos kx\,dx = 0 \quad (j \neq k)$$

The functions $\cos x, \cos 2x, \ldots, \cos nx$ are already unit vectors, since

$$\langle \cos kx, \cos kx\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} \cos^2 kx\,dx = 1 \quad \text{for } k = 1, 2, \ldots, n$$

To form an orthonormal set, we need only find a unit vector in the direction of 1. Since

$$\|1\|^2 = \langle 1, 1\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} 1\,dx = 2$$

it follows that $1/\sqrt{2}$ is a unit vector, and hence $\{1/\sqrt{2}, \cos x, \cos 2x, \ldots, \cos nx\}$ is an orthonormal set of vectors.
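These integrals can be spot-checked numerically. Below is an illustrative sketch (assuming SciPy's quad routine) for the inner product (2):

```python
import numpy as np
from scipy.integrate import quad

# <f, g> = (1/pi) * integral of f(x) g(x) over [-pi, pi].
def ip(f, g):
    val, _ = quad(lambda x: f(x) * g(x), -np.pi, np.pi)
    return val / np.pi

one  = lambda x: 1.0
cos1 = lambda x: np.cos(x)
cos2 = lambda x: np.cos(2 * x)

print(round(ip(one, cos1), 10))   # 0.0 -> orthogonal
print(round(ip(cos1, cos2), 10))  # 0.0 -> orthogonal
print(round(ip(cos1, cos1), 10))  # 1.0 -> cos x is a unit vector
print(round(ip(one, one), 10))    # 2.0 -> ||1||^2 = 2
```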

It follows from Theorem 5.5.1 that if $B = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k\}$ is an orthonormal set in an inner product space V, then B is a basis for the subspace $S = \operatorname{Span}(\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k)$. We say that B is an orthonormal basis for S. It is generally much easier to work with an orthonormal basis than with an ordinary basis. In particular, it is much easier to calculate the coordinates of a given vector $\mathbf{v}$ with respect to an orthonormal basis. Once these coordinates have been determined, they can be used to compute $\|\mathbf{v}\|$.

Theorem 5.5.2

Let $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ be an orthonormal basis for an inner product space V. If $\mathbf{v} = \sum_{i=1}^{n} c_i\mathbf{u}_i$, then $c_i = \langle\mathbf{v}, \mathbf{u}_i\rangle$.

Proof

$$\langle\mathbf{v}, \mathbf{u}_i\rangle = \left\langle\sum_{j=1}^{n} c_j\mathbf{u}_j, \mathbf{u}_i\right\rangle = \sum_{j=1}^{n} c_j\langle\mathbf{u}_j, \mathbf{u}_i\rangle = \sum_{j=1}^{n} c_j\delta_{ji} = c_i$$

As a consequence of Theorem 5.5.2, we can state two more important results.

Corollary 5.5.3

Let $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ be an orthonormal basis for an inner product space V. If $\mathbf{u} = \sum_{i=1}^{n} a_i\mathbf{u}_i$ and $\mathbf{v} = \sum_{i=1}^{n} b_i\mathbf{u}_i$, then

$$\langle\mathbf{u}, \mathbf{v}\rangle = \sum_{i=1}^{n} a_ib_i$$

Proof

By Theorem 5.5.2,

$$\langle\mathbf{v}, \mathbf{u}_i\rangle = b_i \qquad i = 1, \ldots, n$$

Therefore,

$$\langle\mathbf{u}, \mathbf{v}\rangle = \left\langle\sum_{i=1}^{n} a_i\mathbf{u}_i, \mathbf{v}\right\rangle = \sum_{i=1}^{n} a_i\langle\mathbf{u}_i, \mathbf{v}\rangle = \sum_{i=1}^{n} a_i\langle\mathbf{v}, \mathbf{u}_i\rangle = \sum_{i=1}^{n} a_ib_i$$

Corollary 5.5.4 Parseval’s Formula

If $\{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$ is an orthonormal basis for an inner product space V and $\mathbf{v} = \sum_{i=1}^{n} c_i\mathbf{u}_i$, then

$$\|\mathbf{v}\|^2 = \sum_{i=1}^{n} c_i^2$$

Proof

If $\mathbf{v} = \sum_{i=1}^{n} c_i\mathbf{u}_i$, then, by Corollary 5.5.3,

$$\|\mathbf{v}\|^2 = \langle\mathbf{v}, \mathbf{v}\rangle = \sum_{i=1}^{n} c_i^2$$

Example 4

The vectors

$$\mathbf{u}_1 = \left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)^T \quad \text{and} \quad \mathbf{u}_2 = \left(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\right)^T$$

form an orthonormal basis for $\mathbb{R}^2$. If $\mathbf{x} \in \mathbb{R}^2$, then

$$\mathbf{x}^T\mathbf{u}_1 = \frac{x_1 + x_2}{\sqrt{2}} \quad \text{and} \quad \mathbf{x}^T\mathbf{u}_2 = \frac{x_1 - x_2}{\sqrt{2}}$$

It follows from Theorem 5.5.2 that

$$\mathbf{x} = \frac{x_1 + x_2}{\sqrt{2}}\mathbf{u}_1 + \frac{x_1 - x_2}{\sqrt{2}}\mathbf{u}_2$$

and from Corollary 5.5.4 that

$$\|\mathbf{x}\|^2 = \left(\frac{x_1 + x_2}{\sqrt{2}}\right)^2 + \left(\frac{x_1 - x_2}{\sqrt{2}}\right)^2 = x_1^2 + x_2^2$$
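A quick numerical illustration of Theorem 5.5.2 and Parseval's formula for this basis (a sketch assuming NumPy; the test vector is an arbitrary choice):

```python
import numpy as np

# The orthonormal basis of Example 4.
u1 = np.array([1.0, 1.0]) / np.sqrt(2)
u2 = np.array([1.0, -1.0]) / np.sqrt(2)

x = np.array([3.0, -7.0])      # arbitrary test vector
c1, c2 = x @ u1, x @ u2        # coordinates c_i = <x, u_i>

print(np.allclose(c1 * u1 + c2 * u2, x))   # True: x = c1 u1 + c2 u2
print(np.isclose(c1**2 + c2**2, x @ x))    # True: Parseval's formula
```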

Example 5

Given that $\{1/\sqrt{2}, \cos 2x\}$ is an orthonormal set in $C[-\pi, \pi]$ (with inner product as in Example 3), determine the value of $\int_{-\pi}^{\pi} \sin^4 x\,dx$ without computing antiderivatives.

SOLUTION

Since

$$\sin^2 x = \frac{1 - \cos 2x}{2} = \frac{1}{\sqrt{2}}\cdot\frac{1}{\sqrt{2}} + \left(-\frac{1}{2}\right)\cos 2x$$

it follows from Parseval’s formula that

$$\int_{-\pi}^{\pi} \sin^4 x\,dx = \pi\|\sin^2 x\|^2 = \pi\left(\frac{1}{2} + \frac{1}{4}\right) = \frac{3\pi}{4}$$
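A short numerical confirmation, assuming SciPy's quad (purely illustrative):

```python
import numpy as np
from scipy.integrate import quad

# Integrate sin^4(x) over [-pi, pi] and compare with 3*pi/4.
val, _ = quad(lambda x: np.sin(x) ** 4, -np.pi, np.pi)
print(np.isclose(val, 3 * np.pi / 4))   # True
```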

Orthogonal Matrices

Of particular importance are $n \times n$ matrices whose column vectors form an orthonormal set in $\mathbb{R}^n$; such a matrix is said to be an orthogonal matrix.

Theorem 5.5.5

An $n \times n$ matrix Q is orthogonal if and only if $Q^TQ = I$.

Proof

It follows from the definition that an $n \times n$ matrix Q is orthogonal if and only if its column vectors satisfy

$$\mathbf{q}_i^T\mathbf{q}_j = \delta_{ij}$$

However, $\mathbf{q}_i^T\mathbf{q}_j$ is the (i, j) entry of the matrix $Q^TQ$. Thus, Q is orthogonal if and only if $Q^TQ = I$.

It follows from the theorem that if Q is an orthogonal matrix, then Q is invertible and $Q^{-1} = Q^T$.

Example 6

For any fixed $\theta$, the matrix

$$Q = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

is orthogonal and

$$Q^{-1} = Q^T = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}$$

The matrix Q in Example 6 can be thought of as a linear transformation from $\mathbb{R}^2$ onto $\mathbb{R}^2$ that has the effect of rotating each vector by an angle $\theta$ while leaving the length of the vector unchanged (see Example 2 in Section 4.2). Similarly, $Q^{-1}$ can be thought of as a rotation by the angle $-\theta$ (see Figure 5.5.1).

Figure 5.5.1. Rotation of a vector by the angle $\theta$ and by $-\theta$.

In general, inner products are preserved under multiplication by an orthogonal matrix [i.e., $\langle\mathbf{x}, \mathbf{y}\rangle = \langle Q\mathbf{x}, Q\mathbf{y}\rangle$]. Indeed,

$$\langle Q\mathbf{x}, Q\mathbf{y}\rangle = (Q\mathbf{y})^TQ\mathbf{x} = \mathbf{y}^TQ^TQ\mathbf{x} = \mathbf{y}^T\mathbf{x} = \langle\mathbf{x}, \mathbf{y}\rangle$$

In particular, if $\mathbf{x} = \mathbf{y}$, then $\|Q\mathbf{x}\|^2 = \|\mathbf{x}\|^2$ and hence $\|Q\mathbf{x}\| = \|\mathbf{x}\|$. Multiplication by an orthogonal matrix preserves the lengths of vectors.

Properties of Orthogonal Matrices

If Q is an $n \times n$ orthogonal matrix, then

  1. the column vectors of Q form an orthonormal basis for $\mathbb{R}^n$

  2. $Q^TQ = I$

  3. $Q^T = Q^{-1}$

  4. $\langle Q\mathbf{x}, Q\mathbf{y}\rangle = \langle\mathbf{x}, \mathbf{y}\rangle$

  5. $\|Q\mathbf{x}\|^2 = \|\mathbf{x}\|^2$
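These properties are easy to confirm numerically for the rotation matrix of Example 6. A sketch assuming NumPy; the angle and test vectors are arbitrary choices:

```python
import numpy as np

theta = 0.7                          # arbitrary angle
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

x = np.array([2.0, -1.0])
y = np.array([0.5, 3.0])

print(np.allclose(Q.T @ Q, np.eye(2)))        # property 2: Q^T Q = I
print(np.allclose(Q.T, np.linalg.inv(Q)))     # property 3: Q^T = Q^(-1)
print(np.isclose((Q @ x) @ (Q @ y), x @ y))   # property 4: inner products preserved
print(np.isclose(np.linalg.norm(Q @ x),
                 np.linalg.norm(x)))          # property 5: lengths preserved
```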

Permutation Matrices

A permutation matrix is a matrix formed from the identity matrix by reordering its columns. Clearly, then, permutation matrices are orthogonal matrices. If P is the permutation matrix formed by reordering the columns of I in the order $(k_1, \ldots, k_n)$, then $P = (\mathbf{e}_{k_1}, \ldots, \mathbf{e}_{k_n})$. If A is an $m \times n$ matrix, then

$$AP = (A\mathbf{e}_{k_1}, \ldots, A\mathbf{e}_{k_n}) = (\mathbf{a}_{k_1}, \ldots, \mathbf{a}_{k_n})$$

Postmultiplication of A by P reorders the columns of A in the order $(k_1, \ldots, k_n)$. For example, if

$$A = \begin{bmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{bmatrix} \quad \text{and} \quad P = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}$$

then

$$AP = \begin{bmatrix} 3 & 1 & 2 \\ 3 & 1 & 2 \end{bmatrix}$$

Since $P = (\mathbf{e}_{k_1}, \ldots, \mathbf{e}_{k_n})$ is orthogonal, it follows that

$$P^{-1} = P^T = \begin{bmatrix} \mathbf{e}_{k_1}^T \\ \vdots \\ \mathbf{e}_{k_n}^T \end{bmatrix}$$

The $k_1$ column of $P^T$ will be $\mathbf{e}_1$, the $k_2$ column will be $\mathbf{e}_2$, and so on. Thus, $P^T$ is a permutation matrix. The matrix $P^T$ can be formed directly from I by reordering its rows in the order $(k_1, k_2, \ldots, k_n)$. In general, a permutation matrix can be formed from I by reordering either its rows or its columns.

If Q is the permutation matrix formed by reordering the rows of I in the order $(k_1, k_2, \ldots, k_n)$ and B is an $n \times r$ matrix, then

$$QB = \begin{bmatrix} \mathbf{e}_{k_1}^T \\ \vdots \\ \mathbf{e}_{k_n}^T \end{bmatrix} B = \begin{bmatrix} \mathbf{e}_{k_1}^TB \\ \vdots \\ \mathbf{e}_{k_n}^TB \end{bmatrix} = \begin{bmatrix} \vec{\mathbf{b}}_{k_1} \\ \vdots \\ \vec{\mathbf{b}}_{k_n} \end{bmatrix}$$

Thus, QB is the matrix formed by reordering the rows of B in the order $(k_1, k_2, \ldots, k_n)$. For example, if

$$Q = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 1 & 1 \\ 2 & 2 \\ 3 & 3 \end{bmatrix}$$

then

$$QB = \begin{bmatrix} 3 & 3 \\ 1 & 1 \\ 2 & 2 \end{bmatrix}$$

In general, if P is an $n \times n$ permutation matrix, premultiplication of an $n \times r$ matrix B by P reorders the rows of B and postmultiplication of an $m \times n$ matrix A by P reorders the columns of A.
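Both examples above can be reproduced with a short sketch (assuming NumPy; note that NumPy indices are zero-based, so the order $(3, 1, 2)$ becomes `[2, 0, 1]`):

```python
import numpy as np

# P = (e_{k1}, e_{k2}, e_{k3}) with k = (3, 1, 2): reorder the columns of I.
k = [2, 0, 1]
P = np.eye(3, dtype=int)[:, k]

A = np.array([[1, 2, 3],
              [1, 2, 3]])
print(A @ P)        # columns of A reordered: [[3 1 2], [3 1 2]]

# Q: reorder the rows of I in the same order instead.
Q = np.eye(3, dtype=int)[k, :]
B = np.array([[1, 1],
              [2, 2],
              [3, 3]])
print(Q @ B)        # rows of B reordered: [[3 3], [1 1], [2 2]]
```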

Orthonormal Sets and Least Squares

Orthogonality plays an important role in solving least squares problems. Recall that if A is an $m \times n$ matrix of rank n, then the least squares problem $A\mathbf{x} = \mathbf{b}$ has a unique solution $\hat{\mathbf{x}}$ that is determined by solving the normal equations $A^TA\mathbf{x} = A^T\mathbf{b}$. The projection $\mathbf{p} = A\hat{\mathbf{x}}$ is the vector in R(A) that is closest to $\mathbf{b}$. The least squares problem is especially easy to solve in the case where the column vectors of A form an orthonormal set in $\mathbb{R}^m$.

Theorem 5.5.6

If the column vectors of A form an orthonormal set of vectors in $\mathbb{R}^m$, then $A^TA = I$ and the solution to the least squares problem is

$$\hat{\mathbf{x}} = A^T\mathbf{b}$$

Proof

The (i, j) entry of $A^TA$ is formed from the ith row of $A^T$ and the jth column of A. Thus, the (i, j) entry is actually the scalar product of the ith and jth columns of A. Since the column vectors of A are orthonormal, it follows that

$$A^TA = (\delta_{ij}) = I$$

Consequently, the normal equations simplify to

$$\mathbf{x} = A^T\mathbf{b}$$
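A numerical sketch of the theorem, assuming NumPy: the matrix A below has orthonormal columns (an arbitrary choice), and $A^T\mathbf{b}$ agrees with a general least squares solver:

```python
import numpy as np

# A has orthonormal columns: (1,1,1)/sqrt(3) and (1,0,-1)/sqrt(2).
A = np.column_stack([np.array([1.0, 1.0, 1.0]) / np.sqrt(3),
                     np.array([1.0, 0.0, -1.0]) / np.sqrt(2)])
b = np.array([1.0, 2.0, 4.0])

x_hat = A.T @ b                                  # normal equations collapse to this
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)  # general-purpose solver
print(np.allclose(x_hat, x_lstsq))               # True
```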

What if the columns of A are not orthonormal? In the next section, we will learn a method for finding an orthonormal basis for R(A). From this method, we will obtain a factorization of A into a product QR, where Q has an orthonormal set of column vectors and R is upper triangular. With this factorization, the least squares problem is easily solved.

If we have an orthonormal basis for R(A), the projection $\mathbf{p} = A\hat{\mathbf{x}}$ can be determined in terms of the basis elements. Indeed, this is a special case of the more general least squares problem of finding the element p in a subspace S of an inner product space V that is closest to a given element x in V. This problem is easily solved if S has an orthonormal basis. We first prove the following theorem.

Theorem 5.5.7

Let S be a subspace of an inner product space V and let $\mathbf{x} \in V$. Let $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ be an orthonormal basis for S. If

$$\mathbf{p} = \sum_{i=1}^{n} c_i\mathbf{u}_i \tag{3}$$

where

$$c_i = \langle\mathbf{x}, \mathbf{u}_i\rangle \quad \text{for each } i \tag{4}$$

then $\mathbf{p} - \mathbf{x} \perp S$ (see Figure 5.5.2).

Figure 5.5.2. The vectors $\mathbf{x}$, $\mathbf{p}$, and $\mathbf{p} - \mathbf{x}$ form a right triangle over the plane S.

Proof

We will show first that $(\mathbf{p} - \mathbf{x}) \perp \mathbf{u}_i$ for each i:

$$\langle\mathbf{u}_i, \mathbf{p} - \mathbf{x}\rangle = \langle\mathbf{u}_i, \mathbf{p}\rangle - \langle\mathbf{u}_i, \mathbf{x}\rangle = \left\langle\mathbf{u}_i, \sum_{j=1}^{n} c_j\mathbf{u}_j\right\rangle - c_i = \sum_{j=1}^{n} c_j\langle\mathbf{u}_i, \mathbf{u}_j\rangle - c_i = 0$$

So $\mathbf{p} - \mathbf{x}$ is orthogonal to all the $\mathbf{u}_i$'s. If $\mathbf{y} \in S$, then

$$\mathbf{y} = \sum_{i=1}^{n} \alpha_i\mathbf{u}_i$$

and hence

$$\langle\mathbf{p} - \mathbf{x}, \mathbf{y}\rangle = \left\langle\mathbf{p} - \mathbf{x}, \sum_{i=1}^{n} \alpha_i\mathbf{u}_i\right\rangle = \sum_{i=1}^{n} \alpha_i\langle\mathbf{p} - \mathbf{x}, \mathbf{u}_i\rangle = 0$$

If $\mathbf{x} \in S$, the preceding result is trivial, since, by Theorem 5.5.2, $\mathbf{p} - \mathbf{x} = \mathbf{0}$. If $\mathbf{x} \notin S$, then p is the element in S closest to x.

Theorem 5.5.8

Under the hypothesis of Theorem 5.5.7, p is the element of S that is closest to x; that is,

$$\|\mathbf{y} - \mathbf{x}\| > \|\mathbf{p} - \mathbf{x}\|$$

for any $\mathbf{y} \neq \mathbf{p}$ in S.

Proof

If $\mathbf{y} \in S$ and $\mathbf{y} \neq \mathbf{p}$, then

$$\|\mathbf{y} - \mathbf{x}\|^2 = \|(\mathbf{y} - \mathbf{p}) + (\mathbf{p} - \mathbf{x})\|^2$$

Since $\mathbf{y} - \mathbf{p} \in S$, it follows from Theorem 5.5.7 and the Pythagorean law that

$$\|\mathbf{y} - \mathbf{x}\|^2 = \|\mathbf{y} - \mathbf{p}\|^2 + \|\mathbf{p} - \mathbf{x}\|^2 > \|\mathbf{p} - \mathbf{x}\|^2$$

Therefore, $\|\mathbf{y} - \mathbf{x}\| > \|\mathbf{p} - \mathbf{x}\|$.

The vector p defined by (3) and (4) is said to be the projection of x onto S.

Corollary 5.5.9

Let S be a nonzero subspace of $\mathbb{R}^m$ and let $\mathbf{b} \in \mathbb{R}^m$. If $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k\}$ is an orthonormal basis for S and $U = (\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k)$, then the projection p of b onto S is given by

$$\mathbf{p} = UU^T\mathbf{b}$$

Proof

It follows from Theorem 5.5.7 that the projection p of b onto S is given by

$$\mathbf{p} = c_1\mathbf{u}_1 + c_2\mathbf{u}_2 + \cdots + c_k\mathbf{u}_k = U\mathbf{c}$$

where

$$\mathbf{c} = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_k \end{bmatrix} = \begin{bmatrix} \mathbf{u}_1^T\mathbf{b} \\ \mathbf{u}_2^T\mathbf{b} \\ \vdots \\ \mathbf{u}_k^T\mathbf{b} \end{bmatrix} = U^T\mathbf{b}$$

Therefore,

$$\mathbf{p} = UU^T\mathbf{b}$$

The matrix $UU^T$ in Corollary 5.5.9 is the projection matrix corresponding to the subspace S of $\mathbb{R}^m$. To project any vector $\mathbf{b} \in \mathbb{R}^m$ onto S, we need only find an orthonormal basis $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k\}$ for S, form the matrix $UU^T$, and then multiply $UU^T$ times b.

If P is a projection matrix corresponding to a subspace S of $\mathbb{R}^m$, then, for any $\mathbf{b} \in \mathbb{R}^m$, the projection p of b onto S is unique. If Q is also a projection matrix corresponding to S, then

$$Q\mathbf{b} = \mathbf{p} = P\mathbf{b}$$

It then follows that

$$\mathbf{q}_j = Q\mathbf{e}_j = P\mathbf{e}_j = \mathbf{p}_j \quad \text{for } j = 1, \ldots, m$$

and hence $Q = P$. Thus, the projection matrix for a subspace S of $\mathbb{R}^m$ is unique.

Example 7

Let S be the set of all vectors in $\mathbb{R}^3$ of the form $(x, y, 0)^T$. Find the vector p in S that is closest to $\mathbf{w} = (5, 3, 4)^T$ (see Figure 5.5.3).

Figure 5.5.3. The vector $\mathbf{w} = (5, 3, 4)^T$ and its projection $(5, 3, 0)^T$ onto the xy-plane.

SOLUTION

Let $\mathbf{u}_1 = (1, 0, 0)^T$ and $\mathbf{u}_2 = (0, 1, 0)^T$. Clearly, $\mathbf{u}_1$ and $\mathbf{u}_2$ form an orthonormal basis for S. Now

$$c_1 = \mathbf{w}^T\mathbf{u}_1 = 5 \qquad c_2 = \mathbf{w}^T\mathbf{u}_2 = 3$$

The vector p turns out to be exactly what we would expect:

$$\mathbf{p} = 5\mathbf{u}_1 + 3\mathbf{u}_2 = (5, 3, 0)^T$$

Alternatively, p could have been calculated using the projection matrix $UU^T$:

$$\mathbf{p} = UU^T\mathbf{w} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} 5 \\ 3 \\ 4 \end{bmatrix} = \begin{bmatrix} 5 \\ 3 \\ 0 \end{bmatrix}$$
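The same computation in a short sketch, assuming NumPy:

```python
import numpy as np

# Orthonormal basis u1, u2 of the xy-plane, stored as the columns of U.
U = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
w = np.array([5.0, 3.0, 4.0])

P = U @ U.T        # projection matrix for S
print(P @ w)       # [5. 3. 0.]
```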

Approximation of Functions

In many applications, it is necessary to approximate a continuous function in terms of functions from some special type of approximating set. Most commonly, we approximate by a polynomial of degree n or less. We can use Theorem 5.5.8 to obtain the best least squares approximation.

Example 8

Find the best least squares approximation to $e^x$ on the interval [0, 1] by a linear function.

SOLUTION

Let S be the subspace of all linear functions in C[0, 1]. Although the functions 1 and x span S, they are not orthogonal. We seek a function of the form $x - a$ that is orthogonal to 1:

$$\langle 1, x - a\rangle = \int_0^1 (x - a)\,dx = \frac{1}{2} - a$$

Thus, $a = \frac{1}{2}$. Since $\left\|x - \frac{1}{2}\right\| = 1/\sqrt{12}$, it follows that

$$u_1(x) = 1 \quad \text{and} \quad u_2(x) = \sqrt{12}\left(x - \frac{1}{2}\right)$$

form an orthonormal basis for S.

Let

$$c_1 = \int_0^1 u_1(x)e^x\,dx = e - 1$$

$$c_2 = \int_0^1 u_2(x)e^x\,dx = \sqrt{3}(3 - e)$$

The projection

$$p(x) = c_1u_1(x) + c_2u_2(x) = (e - 1)\cdot 1 + \sqrt{3}(3 - e)\left[\sqrt{12}\left(x - \frac{1}{2}\right)\right] = (4e - 10) + 6(3 - e)x$$

is the best linear least squares approximation to $e^x$ on [0, 1] (see Figure 5.5.4).

Figure 5.5.4. The curve $y = e^x$ and its best linear least squares approximation $y = p(x)$ on [0, 1].
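The coefficients and the projection can be reproduced numerically. A sketch assuming SciPy's quad:

```python
import numpy as np
from scipy.integrate import quad

# Orthonormal basis for the linear functions on [0, 1].
u1 = lambda x: 1.0
u2 = lambda x: np.sqrt(12) * (x - 0.5)

c1, _ = quad(lambda x: u1(x) * np.exp(x), 0, 1)
c2, _ = quad(lambda x: u2(x) * np.exp(x), 0, 1)

e = np.e
print(np.isclose(c1, e - 1))                 # True
print(np.isclose(c2, np.sqrt(3) * (3 - e)))  # True

# p(x) = c1 u1(x) + c2 u2(x) should equal (4e - 10) + 6(3 - e) x.
p = lambda x: c1 * u1(x) + c2 * u2(x)
print(np.isclose(p(0.0), 4 * e - 10))        # True
```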

Approximation by Trigonometric Polynomials

Trigonometric polynomials are used to approximate periodic functions. By a trigonometric polynomial of degree n, we mean a function of the form

$$t_n(x) = \frac{a_0}{2} + \sum_{k=1}^{n}(a_k\cos kx + b_k\sin kx)$$

We have already seen that the collection of functions

$$\frac{1}{\sqrt{2}}, \cos x, \cos 2x, \ldots, \cos nx$$

forms an orthonormal set with respect to the inner product (2). We leave it to the reader to verify that if the functions

$$\sin x, \sin 2x, \ldots, \sin nx$$

are added to the collection, it will still be an orthonormal set. Thus, we can use Theorem 5.5.8 to find the best least squares approximation to a continuous $2\pi$-periodic function f(x) by a trigonometric polynomial of degree n or less. Note that

$$\left\langle f, \frac{1}{\sqrt{2}}\right\rangle\frac{1}{\sqrt{2}} = \langle f, 1\rangle\frac{1}{2}$$

so that if

$$a_0 = \langle f, 1\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\,dx$$

and

$$a_k = \langle f, \cos kx\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos kx\,dx$$

$$b_k = \langle f, \sin kx\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin kx\,dx$$

for $k = 1, 2, \ldots, n$, then these coefficients determine the best least squares approximation to f. The $a_k$'s and the $b_k$'s turn out to be the well-known Fourier coefficients that occur in many applications involving trigonometric series approximations of functions.
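As an illustration, the following sketch (assuming SciPy's quad; the test function $f(x) = x^2$ is an arbitrary choice with a well-known Fourier series) computes a few of these coefficients numerically:

```python
import numpy as np
from scipy.integrate import quad

f = lambda x: x ** 2                # even function, so all b_k should vanish

def a(k):
    val, _ = quad(lambda x: f(x) * np.cos(k * x), -np.pi, np.pi)
    return val / np.pi

def b(k):
    val, _ = quad(lambda x: f(x) * np.sin(k * x), -np.pi, np.pi)
    return val / np.pi

# Known series: x^2 ~ pi^2/3 + sum over k of 4(-1)^k / k^2 * cos(kx)
print(np.isclose(a(0), 2 * np.pi ** 2 / 3))   # a_0 = 2*pi^2/3
print(np.isclose(a(1), -4.0))                 # a_1 = -4
print(np.isclose(b(1), 0.0))                  # sine terms vanish
```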

Let us think of f(x) as representing the position at time x of an object moving along a line, and let $t_n$ be the Fourier approximation of degree n to f. If we set

$$r_k = \sqrt{a_k^2 + b_k^2} \quad \text{and} \quad \theta_k = \operatorname{Tan}^{-1}\left(\frac{b_k}{a_k}\right)$$

then

$$a_k\cos kx + b_k\sin kx = r_k\left(\frac{a_k}{r_k}\cos kx + \frac{b_k}{r_k}\sin kx\right) = r_k\cos(kx - \theta_k)$$

Thus, the motion f (x) is being represented as a sum of simple harmonic motions.

For signal-processing applications, it is useful to express the trigonometric approximation in complex form. To this end, we define complex Fourier coefficients $c_k$ in terms of the real Fourier coefficients $a_k$ and $b_k$:

$$c_k = \frac{1}{2}(a_k - ib_k) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)(\cos kx - i\sin kx)\,dx = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)e^{-ikx}\,dx \quad (k \ge 0)$$

The latter equality follows from the identity

$$e^{i\theta} = \cos\theta + i\sin\theta$$

We also define the coefficient $c_{-k}$ to be the complex conjugate of $c_k$. Thus,

$$c_{-k} = \overline{c_k} = \frac{1}{2}(a_k + ib_k) \quad (k \ge 0)$$

Alternatively, if we solve for $a_k$ and $b_k$, then

$$a_k = c_k + c_{-k} \quad \text{and} \quad b_k = i(c_k - c_{-k})$$

From these identities, it follows that

$$c_ke^{ikx} + c_{-k}e^{-ikx} = (c_k + c_{-k})\cos kx + i(c_k - c_{-k})\sin kx = a_k\cos kx + b_k\sin kx$$

and hence the trigonometric polynomial

$$t_n(x) = \frac{a_0}{2} + \sum_{k=1}^{n}(a_k\cos kx + b_k\sin kx)$$

can be rewritten in complex form as

$$t_n(x) = \sum_{k=-n}^{n} c_ke^{ikx}$$
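The identity behind this rewriting can be verified numerically. A sketch assuming SciPy's quad; the test function, index k, and evaluation point are arbitrary choices:

```python
import numpy as np
from scipy.integrate import quad

f = lambda x: np.exp(np.sin(x))     # arbitrary 2*pi-periodic test function
k, x0 = 2, 0.9                      # arbitrary index and evaluation point

ak, _ = quad(lambda x: f(x) * np.cos(k * x), -np.pi, np.pi); ak /= np.pi
bk, _ = quad(lambda x: f(x) * np.sin(k * x), -np.pi, np.pi); bk /= np.pi

ck  = 0.5 * (ak - 1j * bk)          # c_k
cmk = 0.5 * (ak + 1j * bk)          # c_{-k}, the conjugate of c_k

# c_k e^{ikx} + c_{-k} e^{-ikx} should equal a_k cos(kx) + b_k sin(kx).
lhs = ck * np.exp(1j * k * x0) + cmk * np.exp(-1j * k * x0)
rhs = ak * np.cos(k * x0) + bk * np.sin(k * x0)
print(np.isclose(lhs, rhs))         # True
```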