6.5 The Singular Value Decomposition

In many applications, it is necessary either to determine the rank of a matrix or to determine whether the matrix is deficient in rank. Theoretically, we can use Gaussian elimination to reduce the matrix to row echelon form and then count the number of nonzero rows. However, this approach is not practical in finite-precision arithmetic. If A is rank deficient and U is the computed echelon form, then, because of rounding errors in the elimination process, it is unlikely that U will have the proper number of nonzero rows. In practice, the coefficient matrix A usually involves some error. This may be due to errors in the data or to the finite number system. Thus, it is generally more practical to ask whether A is “close” to a rank-deficient matrix. However, it may well turn out that A is close to being rank deficient and the computed row echelon form U is not.

In this section, we assume throughout that A is an $m \times n$ matrix with $m \ge n$. (This assumption is made for convenience only; all the results will also hold if $m < n$.) We will present a method for determining how close A is to a matrix of smaller rank. The method involves factoring A into a product $U\Sigma V^T$, where U is an $m \times m$ orthogonal matrix, V is an $n \times n$ orthogonal matrix, and $\Sigma$ is an $m \times n$ matrix whose off-diagonal entries are all 0's and whose diagonal elements satisfy

$$\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n \ge 0$$

The $\sigma_i$'s determined by this factorization are unique and are called the singular values of A. The factorization $U\Sigma V^T$ is called the singular value decomposition of A, or, for short, the svd of A. We will show that the rank of A equals the number of nonzero singular values, and that the magnitudes of the nonzero singular values provide a measure of how close A is to a matrix of lower rank.

We begin by showing that such a decomposition is always possible.

Theorem 6.5.1 The SVD Theorem

If A is an $m \times n$ matrix, then A has a singular value decomposition.

Proof

$A^TA$ is a symmetric $n \times n$ matrix. Therefore, its eigenvalues are all real and it has an orthogonal diagonalizing matrix V. Furthermore, its eigenvalues must all be nonnegative. To see this, let $\lambda$ be an eigenvalue of $A^TA$ and $\mathbf{x}$ be an eigenvector belonging to $\lambda$. It follows that

$$\|A\mathbf{x}\|^2 = \mathbf{x}^T A^T A \mathbf{x} = \lambda \mathbf{x}^T \mathbf{x} = \lambda \|\mathbf{x}\|^2$$

Hence,

$$\lambda = \frac{\|A\mathbf{x}\|^2}{\|\mathbf{x}\|^2} \ge 0$$

We may assume that the columns of V have been ordered so that the corresponding eigenvalues satisfy

$$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge 0$$

The singular values of A are given by

$$\sigma_j = \sqrt{\lambda_j}, \quad j = 1, \ldots, n$$

Let r denote the rank of A. The matrix $A^TA$ will also have rank r. Since $A^TA$ is symmetric, its rank equals the number of nonzero eigenvalues. Thus,

$$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_r > 0 \quad \text{and} \quad \lambda_{r+1} = \lambda_{r+2} = \cdots = \lambda_n = 0$$

The same relation holds for the singular values

$$\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0 \quad \text{and} \quad \sigma_{r+1} = \sigma_{r+2} = \cdots = \sigma_n = 0$$

Now let

$$V_1 = (\mathbf{v}_1, \ldots, \mathbf{v}_r), \qquad V_2 = (\mathbf{v}_{r+1}, \ldots, \mathbf{v}_n)$$

and

$$\Sigma_1 = \begin{bmatrix} \sigma_1 & & & \\ & \sigma_2 & & \\ & & \ddots & \\ & & & \sigma_r \end{bmatrix} \tag{1}$$

Hence, $\Sigma_1$ is an $r \times r$ diagonal matrix whose diagonal entries are the nonzero singular values $\sigma_1, \ldots, \sigma_r$. The $m \times n$ matrix $\Sigma$ is then given by

$$\Sigma = \begin{bmatrix} \Sigma_1 & O \\ O & O \end{bmatrix}$$

The column vectors of $V_2$ are eigenvectors of $A^TA$ belonging to $\lambda = 0$. Thus,

$$A^TA\mathbf{v}_j = \mathbf{0}, \quad j = r+1, \ldots, n$$

and, consequently, the column vectors of $V_2$ form an orthonormal basis for $N(A^TA) = N(A)$. Therefore,

$$AV_2 = O$$

and since V is an orthogonal matrix, it follows that

$$I = VV^T = V_1V_1^T + V_2V_2^T$$

$$A = AI = AV_1V_1^T + AV_2V_2^T = AV_1V_1^T \tag{2}$$

So far we have shown how to construct the matrices V and $\Sigma$ of the singular value decomposition. To complete the proof, we must show how to construct an $m \times m$ orthogonal matrix U such that

$$A = U\Sigma V^T$$

or, equivalently,

$$AV = U\Sigma \tag{3}$$

Comparing the first r columns of each side of (3), we see that

$$A\mathbf{v}_j = \sigma_j\mathbf{u}_j, \quad j = 1, \ldots, r$$

Thus, if we define

$$\mathbf{u}_j = \frac{1}{\sigma_j}A\mathbf{v}_j, \quad j = 1, \ldots, r \tag{4}$$

and

$$U_1 = (\mathbf{u}_1, \ldots, \mathbf{u}_r)$$

then it follows that

$$AV_1 = U_1\Sigma_1 \tag{5}$$

The column vectors of $U_1$ form an orthonormal set, since

$$\mathbf{u}_i^T\mathbf{u}_j = \left(\frac{1}{\sigma_i}\mathbf{v}_i^TA^T\right)\left(\frac{1}{\sigma_j}A\mathbf{v}_j\right) = \frac{1}{\sigma_i\sigma_j}\mathbf{v}_i^T(A^TA\mathbf{v}_j) = \frac{\sigma_j}{\sigma_i}\mathbf{v}_i^T\mathbf{v}_j = \delta_{ij}, \qquad 1 \le i \le r, \ 1 \le j \le r$$

It follows from (4) that each $\mathbf{u}_j$, $1 \le j \le r$, is in the column space of A. The dimension of the column space is r, so $\mathbf{u}_1, \ldots, \mathbf{u}_r$ form an orthonormal basis for $R(A)$. The vector space $R(A)^\perp = N(A^T)$ has dimension $m - r$. Let $\{\mathbf{u}_{r+1}, \mathbf{u}_{r+2}, \ldots, \mathbf{u}_m\}$ be an orthonormal basis for $N(A^T)$ and set

$$U_2 = (\mathbf{u}_{r+1}, \mathbf{u}_{r+2}, \ldots, \mathbf{u}_m), \qquad U = [\,U_1 \;\; U_2\,]$$

It follows from Theorem 5.2.2 that $\mathbf{u}_1, \ldots, \mathbf{u}_m$ form an orthonormal basis for $\mathbb{R}^m$. Hence, U is an orthogonal matrix. We still must show that $U\Sigma V^T$ actually equals A. This follows from (5) and (2), since

$$U\Sigma V^T = [\,U_1 \;\; U_2\,]\begin{bmatrix} \Sigma_1 & O \\ O & O \end{bmatrix}\begin{bmatrix} V_1^T \\ V_2^T \end{bmatrix} = U_1\Sigma_1V_1^T = AV_1V_1^T = A$$

Observations

Let A be an $m \times n$ matrix with a singular value decomposition $U\Sigma V^T$.

  1. The singular values $\sigma_1, \ldots, \sigma_n$ of A are unique; however, the matrices U and V are not unique.

  2. Since V diagonalizes $A^TA$, it follows that the $\mathbf{v}_j$'s are eigenvectors of $A^TA$.

  3. Since $AA^T = U\Sigma\Sigma^TU^T$, it follows that U diagonalizes $AA^T$ and that the $\mathbf{u}_j$'s are eigenvectors of $AA^T$.

  4. Comparing the jth columns of each side of the equation

    $$AV = U\Sigma$$

    we get

    $$A\mathbf{v}_j = \sigma_j\mathbf{u}_j, \quad j = 1, \ldots, n$$

    Similarly,

    $$A^TU = V\Sigma^T$$

    and hence

    $$A^T\mathbf{u}_j = \sigma_j\mathbf{v}_j \quad \text{for } j = 1, \ldots, n, \qquad A^T\mathbf{u}_j = \mathbf{0} \quad \text{for } j = n+1, \ldots, m$$

    The vjvj’s are called the right singular vectors of A, and the ujuj’s are called the left singular vectors of A.

  5. If A has rank r, then

    1. $\mathbf{v}_1, \ldots, \mathbf{v}_r$ form an orthonormal basis for $R(A^T)$.

    2. $\mathbf{v}_{r+1}, \ldots, \mathbf{v}_n$ form an orthonormal basis for $N(A)$.

    3. $\mathbf{u}_1, \ldots, \mathbf{u}_r$ form an orthonormal basis for $R(A)$.

    4. $\mathbf{u}_{r+1}, \ldots, \mathbf{u}_m$ form an orthonormal basis for $N(A^T)$.

  6. The rank of the matrix A is equal to the number of its nonzero singular values (where singular values are counted according to multiplicity). The reader should be careful not to make a similar assumption about eigenvalues. The matrix

    $$M = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

    for example, has rank 3 even though all of its eigenvalues are 0.

  7. In the case that A has rank $r < n$, if we set

    $$U_1 = (\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_r), \qquad V_1 = (\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r)$$

    and define $\Sigma_1$ as in equation (1), then

    $$A = U_1\Sigma_1V_1^T \tag{6}$$

    The factorization (6) is called the compact form of the singular value decomposition of A. This form is useful in many applications.
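
The compact form lends itself directly to computation. Below is a minimal sketch (using NumPy; the code and the example matrix are our own choices, not part of the text) that extracts $U_1$, $\Sigma_1$, and $V_1$ for a rank-deficient matrix and verifies (6):

```python
import numpy as np

# A minimal sketch (NumPy), not from the text: compute the compact form (6)
# for a rank-deficient matrix. full_matrices=False gives the economy-size
# factors; we then keep only the r columns belonging to nonzero sigmas.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))  # rank 2

U, s, Vt = np.linalg.svd(A, full_matrices=False)
r = int(np.sum(s > 1e-10 * s[0]))     # numerical rank: count "nonzero" sigmas

U1, Sigma1, V1t = U[:, :r], np.diag(s[:r]), Vt[:r, :]
print(r)                                   # 2
print(np.allclose(A, U1 @ Sigma1 @ V1t))   # True: A = U1 Sigma1 V1^T
```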

Example 1

Let

$$A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \\ 0 & 0 \end{bmatrix}$$

Compute the singular values and the singular value decomposition of A.

SOLUTION

The matrix

$$A^TA = \begin{bmatrix} 2 & 2 \\ 2 & 2 \end{bmatrix}$$

has eigenvalues $\lambda_1 = 4$ and $\lambda_2 = 0$. Consequently, the singular values of A are $\sigma_1 = \sqrt{4} = 2$ and $\sigma_2 = 0$. The eigenvalue $\lambda_1$ has eigenvectors of the form $\alpha(1, 1)^T$, and $\lambda_2$ has eigenvectors of the form $\beta(1, -1)^T$. Therefore, the orthogonal matrix

$$V = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$$

diagonalizes $A^TA$. From observation 4, it follows that

$$\mathbf{u}_1 = \frac{1}{\sigma_1}A\mathbf{v}_1 = \frac{1}{2}\begin{bmatrix} 1 & 1 \\ 1 & 1 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \\ 0 \end{bmatrix}$$

The remaining column vectors of U must form an orthonormal basis for $N(A^T)$. We can compute a basis $\{\mathbf{x}_2, \mathbf{x}_3\}$ for $N(A^T)$ in the usual way:

$$\mathbf{x}_2 = (1, -1, 0)^T \quad \text{and} \quad \mathbf{x}_3 = (0, 0, 1)^T$$

Since these vectors are already orthogonal, it is not necessary to use the Gram–Schmidt process to obtain an orthonormal basis. We need only set

$$\mathbf{u}_2 = \frac{1}{\|\mathbf{x}_2\|}\mathbf{x}_2 = \left(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}, 0\right)^T, \qquad \mathbf{u}_3 = \mathbf{x}_3 = (0, 0, 1)^T$$

It then follows that

$$A = U\Sigma V^T = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 2 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix}$$
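
As a quick check (our own addition, not part of the text), the decomposition can be verified numerically; keep in mind that a library may return singular vectors differing from ours by a factor of $-1$, since U and V are not unique (observation 1):

```python
import numpy as np

# Verify Example 1 numerically (a sketch, not from the text).
A = np.array([[1.0, 1.0],
              [1.0, 1.0],
              [0.0, 0.0]])
U, s, Vt = np.linalg.svd(A)
print(s)                                # [2. 0.] -- sigma_1 = 2, sigma_2 = 0
Sigma = np.zeros((3, 2))                # Sigma must have the shape of A
Sigma[:2, :2] = np.diag(s)
print(np.allclose(A, U @ Sigma @ Vt))   # True
```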

Visualizing the SVD

If we view an $m \times n$ matrix A with rank r as a mapping from the row space of A to the column space of A, then in light of observations (4) and (5) made earlier, it seems natural to choose $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$ as an orthonormal basis for the row space, since the image vectors

$$A\mathbf{v}_1 = \sigma_1\mathbf{u}_1, \quad A\mathbf{v}_2 = \sigma_2\mathbf{u}_2, \quad \ldots, \quad A\mathbf{v}_r = \sigma_r\mathbf{u}_r$$

are mutually orthogonal and the corresponding unit vectors $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_r$ will form an orthonormal basis for the column space of A. In the case of a $2 \times 2$ matrix, the following example illustrates geometrically how one could search for the right singular vectors by moving around the unit circle.

Example 2

Let

$$A = \begin{bmatrix} 0.4 & -0.3 \\ 0.9 & 1.2 \end{bmatrix}$$

To find a pair of right singular vectors of A, we must find a pair of orthonormal vectors $\mathbf{x}$ and $\mathbf{y}$ for which the image vectors $A\mathbf{x}$ and $A\mathbf{y}$ are orthogonal. Choosing the standard basis vectors for $\mathbb{R}^2$ does not work, for if $\mathbf{x} = \mathbf{e}_1$ and $\mathbf{y} = \mathbf{e}_2$, then the image vectors

$$A\mathbf{e}_1 = \mathbf{a}_1 = \begin{bmatrix} 0.4 \\ 0.9 \end{bmatrix} \quad \text{and} \quad A\mathbf{e}_2 = \mathbf{a}_2 = \begin{bmatrix} -0.3 \\ 1.2 \end{bmatrix}$$

are not orthogonal. See Figure 6.5.1.

Figure 6.5.1. The vectors $\mathbf{x}$, $\mathbf{y}$, $A\mathbf{x}$, and $A\mathbf{y}$ drawn from the origin (both axes run from $-2$ to $2$ in increments of 0.5).

One way to search for the right singular vectors is to simultaneously rotate this initial pair of vectors around the unit circle and for each rotated pair

$$\mathbf{x} = \begin{bmatrix} \cos t \\ \sin t \end{bmatrix}, \qquad \mathbf{y} = \begin{bmatrix} -\sin t \\ \cos t \end{bmatrix}$$

check to see if Ax and Ay are orthogonal. For the given matrix A, this will happen when the tip of our initial x vector gets rotated to the point (0.6,0.8). It follows that the right singular vectors are

$$\mathbf{v}_1 = \begin{bmatrix} 0.6 \\ 0.8 \end{bmatrix}, \qquad \mathbf{v}_2 = \begin{bmatrix} -0.8 \\ 0.6 \end{bmatrix}$$

Since

$$A\mathbf{v}_1 = \begin{bmatrix} 0 \\ 1.5 \end{bmatrix} = 1.5\mathbf{e}_2 \quad \text{and} \quad A\mathbf{v}_2 = \begin{bmatrix} -0.5 \\ 0 \end{bmatrix} = -0.5\mathbf{e}_1$$

it follows that the singular values are $\sigma_1 = 1.5$ and $\sigma_2 = 0.5$, and the left singular vectors are $\mathbf{u}_1 = \mathbf{e}_2$ and $\mathbf{u}_2 = -\mathbf{e}_1$. See Figure 6.5.2.

Figure 6.5.2. The vectors $\mathbf{v}_1$, $\mathbf{v}_2$, $A\mathbf{v}_1$, and $A\mathbf{v}_2$ drawn from the origin (both axes run from $-2$ to $2$ in increments of 0.5).
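
The rotation search is easy to mimic numerically. The sketch below (our own, not from the text) samples t on $[0, \pi/2]$ and picks the angle at which $A\mathbf{x}$ and $A\mathbf{y}$ are most nearly orthogonal:

```python
import numpy as np

# A sketch of the rotation search in Example 2 (not from the text):
# sample t on [0, pi/2] and pick the angle where Ax and Ay are most
# nearly orthogonal.
A = np.array([[0.4, -0.3],
              [0.9,  1.2]])
ts = np.linspace(0.0, np.pi / 2, 100_000)
xs = np.vstack([np.cos(ts), np.sin(ts)])    # rotated e1, one column per t
ys = np.vstack([-np.sin(ts), np.cos(ts)])   # rotated e2
dots = np.sum((A @ xs) * (A @ ys), axis=0)  # (Ax).(Ay) for every t
t = ts[np.argmin(np.abs(dots))]

x = np.array([np.cos(t), np.sin(t)])
print(np.round(x, 3))                       # approx [0.6, 0.8] = v1
print(np.linalg.norm(A @ x))                # approx 1.5 = sigma_1
```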

Numerical Rank and Lower Rank Approximations

If A is an $m \times n$ matrix of rank r and $0 < k < r$, we can use the singular value decomposition to find a matrix in $\mathbb{R}^{m \times n}$ of rank k that is closest to A with respect to the Frobenius norm. Let $\mathcal{M}$ be the set of all $m \times n$ matrices of rank k or less. It can be shown that there is a matrix X in $\mathcal{M}$ such that

$$\|A - X\|_F = \min_{S \in \mathcal{M}} \|A - S\|_F \tag{7}$$

We will not prove this result, since the proof is beyond the scope of this text. Assuming that the minimum is achieved, we will show how such a matrix X can be derived from the singular value decomposition of A. The following lemma will be useful.

Lemma 6.5.2

If A is an $m \times n$ matrix and Q is an $m \times m$ orthogonal matrix, then

$$\|QA\|_F = \|A\|_F$$

Proof

$$\|QA\|_F^2 = \|(Q\mathbf{a}_1, Q\mathbf{a}_2, \ldots, Q\mathbf{a}_n)\|_F^2 = \sum_{i=1}^n \|Q\mathbf{a}_i\|_2^2 = \sum_{i=1}^n \|\mathbf{a}_i\|_2^2 = \|A\|_F^2$$

If A has singular value decomposition $U\Sigma V^T$, then it follows from the lemma that

$$\|A\|_F = \|\Sigma V^T\|_F$$

Since

$$\|\Sigma V^T\|_F = \|(\Sigma V^T)^T\|_F = \|V\Sigma^T\|_F = \|\Sigma^T\|_F$$

it follows that

$$\|A\|_F = (\sigma_1^2 + \sigma_2^2 + \cdots + \sigma_n^2)^{1/2}$$
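
This identity is easy to confirm numerically; here is a minimal sketch (ours, not from the text):

```python
import numpy as np

# Check ||A||_F = (sigma_1^2 + ... + sigma_n^2)^(1/2) on a random matrix
# (a sketch, not from the text).
rng = np.random.default_rng(1)
A = rng.standard_normal((7, 4))
s = np.linalg.svd(A, compute_uv=False)   # singular values only
print(np.linalg.norm(A, 'fro'))          # the two printed values agree
print(np.sqrt(np.sum(s**2)))
```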

Theorem 6.5.3

Let $A = U\Sigma V^T$ be an $m \times n$ matrix, and let $\mathcal{M}$ denote the set of all $m \times n$ matrices of rank k or less, where $0 < k < \operatorname{rank}(A)$. If X is a matrix in $\mathcal{M}$ satisfying (7), then

$$\|A - X\|_F = (\sigma_{k+1}^2 + \sigma_{k+2}^2 + \cdots + \sigma_n^2)^{1/2}$$

In particular, if $A' = U\Sigma'V^T$, where

$$\Sigma' = \begin{bmatrix} \sigma_1 & & & \\ & \ddots & & \\ & & \sigma_k & \\ & & & O \end{bmatrix}$$

is the $m \times n$ matrix obtained from $\Sigma$ by setting $\sigma_{k+1} = \cdots = \sigma_n = 0$,

then

$$\|A - A'\|_F = (\sigma_{k+1}^2 + \cdots + \sigma_n^2)^{1/2} = \min_{S \in \mathcal{M}} \|A - S\|_F$$

Proof

Let X be a matrix in $\mathcal{M}$ satisfying (7). Since $A' \in \mathcal{M}$, it follows that

$$\|A - X\|_F \le \|A - A'\|_F = (\sigma_{k+1}^2 + \cdots + \sigma_n^2)^{1/2} \tag{8}$$

We will show that

$$\|A - X\|_F \ge (\sigma_{k+1}^2 + \cdots + \sigma_n^2)^{1/2}$$

and hence that equality holds in (8). Let $Q\Omega P^T$ be the singular value decomposition of X, where

$$\Omega = \begin{bmatrix} \Omega_k & O \\ O & O \end{bmatrix}$$

and $\Omega_k$ is a $k \times k$ diagonal matrix whose diagonal entries are the nonzero singular values of X.

If we set $B = Q^TAP$, then $A = QBP^T$, and it follows that

$$\|A - X\|_F = \|Q(B - \Omega)P^T\|_F = \|B - \Omega\|_F$$

Let us partition B in the same manner as $\Omega$:

$$B = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}$$

where $B_{11}$ is $k \times k$.

It follows that

$$\|A - X\|_F^2 = \|B_{11} - \Omega_k\|_F^2 + \|B_{12}\|_F^2 + \|B_{21}\|_F^2 + \|B_{22}\|_F^2$$

We claim that $B_{12} = O$. If not, then define

$$Y = Q\begin{bmatrix} B_{11} & B_{12} \\ O & O \end{bmatrix}P^T$$

The matrix Y is in $\mathcal{M}$ and

$$\|A - Y\|_F^2 = \|B_{21}\|_F^2 + \|B_{22}\|_F^2 < \|A - X\|_F^2$$

But this contradicts the definition of X. Therefore, $B_{12} = O$. In a similar manner, it can be shown that $B_{21}$ must equal O. If we set

$$Z = Q\begin{bmatrix} B_{11} & O \\ O & O \end{bmatrix}P^T$$

then $Z \in \mathcal{M}$ and

$$\|A - Z\|_F^2 = \|B_{22}\|_F^2 \le \|B_{11} - \Omega_k\|_F^2 + \|B_{22}\|_F^2 = \|A - X\|_F^2$$

It follows from the definition of X that $B_{11}$ must equal $\Omega_k$. If $B_{22}$ has singular value decomposition $U_1\Lambda V_1^T$, then

$$\|A - X\|_F = \|B_{22}\|_F = \|\Lambda\|_F$$

Let

$$U_2 = \begin{bmatrix} I_k & O \\ O & U_1 \end{bmatrix} \quad \text{and} \quad V_2 = \begin{bmatrix} I_k & O \\ O & V_1 \end{bmatrix}$$

Now,

$$U_2^TQ^TAPV_2 = \begin{bmatrix} \Omega_k & O \\ O & \Lambda \end{bmatrix}, \qquad A = (QU_2)\begin{bmatrix} \Omega_k & O \\ O & \Lambda \end{bmatrix}(PV_2)^T$$

and hence it follows that the diagonal elements of Λ are singular values of A. Thus,

$$\|A - X\|_F = \|\Lambda\|_F \ge (\sigma_{k+1}^2 + \cdots + \sigma_n^2)^{1/2}$$

It then follows from (8) that

$$\|A - X\|_F = (\sigma_{k+1}^2 + \cdots + \sigma_n^2)^{1/2} = \|A - A'\|_F$$

If A has singular value decomposition $U\Sigma V^T$, then we can think of A as the product of $U\Sigma$ times $V^T$. If we partition $U\Sigma$ into columns and $V^T$ into rows, then

$$U\Sigma = (\sigma_1\mathbf{u}_1, \sigma_2\mathbf{u}_2, \ldots, \sigma_n\mathbf{u}_n)$$

and we can represent A by an outer product expansion

$$A = \sigma_1\mathbf{u}_1\mathbf{v}_1^T + \sigma_2\mathbf{u}_2\mathbf{v}_2^T + \cdots + \sigma_n\mathbf{u}_n\mathbf{v}_n^T \tag{9}$$

If A is of rank n, then

$$A' = U\begin{bmatrix} \sigma_1 & & & \\ & \ddots & & \\ & & \sigma_{n-1} & \\ & & & 0 \end{bmatrix}V^T = \sigma_1\mathbf{u}_1\mathbf{v}_1^T + \sigma_2\mathbf{u}_2\mathbf{v}_2^T + \cdots + \sigma_{n-1}\mathbf{u}_{n-1}\mathbf{v}_{n-1}^T$$

will be the matrix of rank $n-1$ that is closest to A with respect to the Frobenius norm. Similarly,

$$A'' = \sigma_1\mathbf{u}_1\mathbf{v}_1^T + \sigma_2\mathbf{u}_2\mathbf{v}_2^T + \cdots + \sigma_{n-2}\mathbf{u}_{n-2}\mathbf{v}_{n-2}^T$$

will be the nearest matrix of rank $n-2$, and so on. In particular, if A is a nonsingular $n \times n$ matrix, then $A'$ is singular and $\|A - A'\|_F = \sigma_n$. Thus, $\sigma_n$ may be taken as a measure of how close a square matrix is to being singular.
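
Theorem 6.5.3 and the expansion (9) translate directly into code. The sketch below (ours, not from the text) truncates the outer product expansion to get the nearest rank-k matrix and confirms that the Frobenius distance is $(\sigma_{k+1}^2 + \cdots + \sigma_n^2)^{1/2}$:

```python
import numpy as np

# Nearest rank-k matrix via the truncated outer product expansion (9)
# (a sketch, not from the text).
rng = np.random.default_rng(2)
A = rng.standard_normal((8, 5))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
Ak = sum(s[j] * np.outer(U[:, j], Vt[j, :]) for j in range(k))

print(np.linalg.matrix_rank(Ak))      # 2
print(np.linalg.norm(A - Ak, 'fro'))  # equals the next printed value
print(np.sqrt(np.sum(s[k:]**2)))      # Theorem 6.5.3
```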

The reader should be careful not to use the value of $\det(A)$ as a measure of how close A is to being singular. If, for example, A is the $100 \times 100$ diagonal matrix whose diagonal entries are all $\frac{1}{2}$, then $\det(A) = 2^{-100}$; however, $\sigma_{100} = \frac{1}{2}$. By contrast, the matrix in the next example is very close to being singular even though its determinant is 1 and all its eigenvalues are equal to 1.

Example 3

Let A be an n×n upper triangular matrix whose diagonal elements are all 1 and whose entries above the main diagonal are all −1:

$$A = \begin{bmatrix} 1 & -1 & -1 & \cdots & -1 & -1 \\ 0 & 1 & -1 & \cdots & -1 & -1 \\ 0 & 0 & 1 & \cdots & -1 & -1 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 1 & -1 \\ 0 & 0 & 0 & \cdots & 0 & 1 \end{bmatrix}$$

Notice that $\det(A) = \det(A^{-1}) = 1$ and all the eigenvalues of A are 1. However, if n is large, then A is close to being singular. To see this, let

$$B = \begin{bmatrix} 1 & -1 & -1 & \cdots & -1 & -1 \\ 0 & 1 & -1 & \cdots & -1 & -1 \\ 0 & 0 & 1 & \cdots & -1 & -1 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 1 & -1 \\ -\dfrac{1}{2^{n-2}} & 0 & 0 & \cdots & 0 & 1 \end{bmatrix}$$

The matrix B must be singular, since the system $B\mathbf{x} = \mathbf{0}$ has a nontrivial solution $\mathbf{x} = (2^{n-2}, 2^{n-3}, \ldots, 2^0, 1)^T$. Since the matrices A and B differ only in the $(n, 1)$ position, we have

$$\|A - B\|_F = \frac{1}{2^{n-2}}$$

It follows from Theorem 6.5.3 that

$$\sigma_n = \min_{X \text{ singular}} \|A - X\|_F \le \|A - B\|_F = \frac{1}{2^{n-2}}$$

Thus, if $n = 100$, then $\sigma_n \le 1/2^{98}$ and, consequently, A is very close to singular.
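
For moderate n this is easy to confirm numerically (a sketch of ours, not from the text); much beyond $n = 50$ the smallest singular value sinks below the resolution of double-precision arithmetic:

```python
import numpy as np

# Example 3 numerically (a sketch, not from the text): 1's on the diagonal,
# -1's strictly above it.
n = 30
A = np.eye(n) - np.triu(np.ones((n, n)), k=1)
s = np.linalg.svd(A, compute_uv=False)
print(np.linalg.det(A))   # 1.0
print(s[-1])              # tiny: sigma_n <= 1/2^(n-2), roughly 3.7e-9
```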

Example 4

Suppose that A is a 5×5 matrix with singular values

$$\sigma_1 = 4, \quad \sigma_2 = 1, \quad \sigma_3 = 10^{-12}, \quad \sigma_4 = 3.1 \times 10^{-14}, \quad \sigma_5 = 2.6 \times 10^{-15}$$

and suppose that the machine epsilon is $5 \times 10^{-15}$. To determine the numerical rank, we compare the singular values to

$$\sigma_1 \cdot \max(m, n) \cdot \epsilon = 4 \cdot 5 \cdot 5 \times 10^{-15} = 10^{-13}$$

Since three of the singular values are greater than $10^{-13}$, the matrix has numerical rank 3.
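
This is the standard test that numerical libraries use to estimate rank. Here is a sketch of the computation in Example 4 (ours, not from the text); the tolerance $\sigma_1 \cdot \max(m, n) \cdot \epsilon$ is essentially the default used by `numpy.linalg.matrix_rank`:

```python
import numpy as np

# Numerical rank test from Example 4 (a sketch, not from the text).
s = np.array([4.0, 1.0, 1e-12, 3.1e-14, 2.6e-15])  # singular values
eps = 5e-15                                        # machine epsilon (given)
m = n = 5
tol = s[0] * max(m, n) * eps                       # 4 * 5 * 5e-15 = 1e-13
print(int(np.sum(s > tol)))                        # 3 -> numerical rank 3
```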
