8
IRLBA: Fast Partial Singular Value Decomposition Method
James Baglama
CONTENTS
8.1 Introduction
8.2 GKLB Procedure
8.3 Augmented Lanczos Bidiagonalization Method
8.4 Numerical Performance
8.5 Conclusion
Appendix
References
8.1 Introduction
Approximation to some of the largest singular values and associated vectors (largest singular
triplets) of very large rectangular matrices is essential in many areas of data analysis, for
example, dimension reduction, data mining, data visualization, and detection of patterns.
Furthermore, the statistical procedure principal component analysis has a direct relationship
to the singular value decomposition (SVD) [16,26,31]. The importance of computing the
singular triplets of very large matrices has spurred numerous computational methods in the
literature; see, for example, [1,3,4,6,9,12–15,18,20,22,24,27,29,30] and references therein.
In 2005, Baglama and Reichel [1] developed a method, the implicitly restarted Lanczos
bidiagonalization algorithm (IRLBA), that is well suited for computing some of the largest
(and smallest) singular triplets of extremely large rectangular matrices. The development
of this routine is based on a long history of methods in eigenvalue and singular value
computation. The foundation is based on the commonly used Golub–Kahan–Lanczos
bidiagonalization (GKLB) procedure [9], which is described in detail in Section 8.2. However,
the pivotal structure of the IRLBA comes from exploiting the mathematical equivalence
of two eigenvalue routines and extending them to a restarted GKLB procedure. In 1992,
Sorensen [28] developed a powerful eigenvalue computation method for very large eigenvalue
problems, which is often referred to as the implicitly restarted Arnoldi method (IRAM).
The IRAM was later implemented in the popular eigenvalue software package ARPACK
[22]. Morgan [25] showed that the IRAM can be implemented by augmenting the Krylov
subspace basis vectors by certain vectors. Such an implementation can be less sensitive to
propagated round-off errors than the IRAM. Wu and Simon [33] described an eigenvalue
method called the thick-restarted Lanczos tridiagonalization, which is simple to implement
and mathematically equivalent to the symmetric IRAM. Harnessing these connections led
to the development of the IRLBA, a simple, powerful, and computationally fast partial
SVD method.
Shortly after developing the IRLBA, Baglama and Reichel [3] extended the method to
work with the block form of the GKLB procedure [10] and published MATLAB computer
codes irlba and irlbablk [2]. Block methods are advantageous when computing multiple
or clustered singular values and can utilize Level 3 BLAS matrix–matrix products. Also,
block methods are favorable when matrices are so large that they have to be retrieved from
disk when matrix–vector products with them are to be evaluated. However, there are added
complexities to a block method, for example, linear dependence of blocks and block size.
We will not discuss block methods in this chapter.
Recently, additional software packages for the IRLBA were developed in the programming
languages R and Python. Lewis [23] developed the irlba package in the open-source
statistical programming language R, and Kane and Lewis [17] created irlbpy, an
implementation of the IRLBA in Python for NumPy. See Section 8.4 for examples
demonstrating the performance of the software implementations.
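To make the calling pattern of such packages concrete, the following Python sketch computes a handful of the largest singular triplets of a large sparse matrix. SciPy's svds routine is used as a stand-in, since the exact irlba/irlbpy signatures are not reproduced in this chapter; the matrix size, density, and number of triplets are illustrative.

```python
# A minimal partial-SVD sketch; scipy.sparse.linalg.svds stands in for
# the irlba/irlbpy interfaces discussed above (it is not the IRLBA code).
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import svds

A = sparse_random(10000, 2000, density=1e-3, format="csr", random_state=0)
u, s, vt = svds(A, k=6)          # six largest singular triplets
order = np.argsort(s)[::-1]      # svds returns singular values in ascending order
u, s, vt = u[:, order], s[order], vt[order]
print(s)                         # approximations to sigma_1 >= ... >= sigma_6
```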
Baglama and Reichel [1] developed two strategies: augmentation with Ritz vectors
irlba(R) for finding the largest (and smallest) singular triplets and augmentation with
harmonic Ritz vectors irlba(H) for finding the smallest singular triplets. When seeking
to compute the smallest singular triplets of a matrix, augmentation by harmonic Ritz
vectors, irlba(H), often yielded faster convergence. That method will not be described in
this chapter, since our focus is on large data and on computing the largest singular triplets;
we refer the reader to [1] for details on irlba(H). In this chapter, we will refer to irlba(R)
as IRLBA.
8.2 GKLB Procedure
Let $A \in \mathbb{R}^{\ell \times n}$ be a large sparse matrix. We may assume that $\ell \geq n$, because otherwise
we replace the matrix by its transpose. Let $u_i \in \mathbb{R}^{\ell}$ and $v_i \in \mathbb{R}^{n}$ denote the left and right
singular vectors of $A$ associated with the singular value $\sigma_i$. Define $U_n = [u_1, u_2, \ldots, u_n] \in \mathbb{R}^{\ell \times n}$ and $V_n = [v_1, v_2, \ldots, v_n] \in \mathbb{R}^{n \times n}$ with orthonormal columns, as well as $\Sigma_n = \operatorname{diag}[\sigma_1, \sigma_2, \ldots, \sigma_n] \in \mathbb{R}^{n \times n}$. Then

$$A V_n = U_n \Sigma_n \quad \text{and} \quad A^T U_n = V_n \Sigma_n \tag{8.1}$$

are the SVD of $A$ and $A^T$, respectively. We assume the singular values are ordered,

$$\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_n \geq 0, \tag{8.2}$$

and refer to $\{\sigma_i, u_i, v_i\}$ as a singular triplet of $A$.
We are interested in approximating the $k$ largest singular triplets $\{\sigma_i, u_i, v_i\}_{i=1}^{k}$. Let the
matrices $U_k \in \mathbb{R}^{\ell \times k}$ and $V_k \in \mathbb{R}^{n \times k}$ consist of the first $k$ columns of the matrices $U_n$ and
$V_n$ in the SVD (Equation 8.1) of $A$, and introduce $\Sigma_k = \operatorname{diag}[\sigma_1, \ldots, \sigma_k] \in \mathbb{R}^{k \times k}$. Then,
analogously to Equation 8.1, we have the partial SVD

$$A V_k = U_k \Sigma_k \quad \text{and} \quad A^T U_k = V_k \Sigma_k. \tag{8.3}$$
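As a quick numerical check of Equations 8.1 through 8.3, the following numpy sketch extracts a partial SVD by truncating a full SVD of a small dense test matrix. (For the very large matrices of interest here, a full SVD is of course not affordable; this only illustrates the identities.)

```python
import numpy as np

rng = np.random.default_rng(0)
l, n, k = 60, 40, 5
A = rng.standard_normal((l, n))

# Full SVD: A V_n = U_n Sigma_n with decreasing singular values (8.1, 8.2).
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Partial SVD (8.3): keep the k largest singular triplets.
Uk, Sk, Vk = U[:, :k], np.diag(s[:k]), Vt[:k, :].T
print(np.allclose(A @ Vk, Uk @ Sk))      # A V_k = U_k Sigma_k
print(np.allclose(A.T @ Uk, Vk @ Sk))    # A^T U_k = V_k Sigma_k
```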
The approximations to Equation 8.3 can be obtained from projections onto Krylov subspaces
$$\begin{aligned}
K_m(A^T A, p_1) &= \operatorname{span}\{p_1,\ A^T A\, p_1,\ (A^T A)^2 p_1,\ \ldots,\ (A^T A)^{m-1} p_1\},\\
K_m(A A^T, q_1) &= \operatorname{span}\{q_1,\ A A^T q_1,\ (A A^T)^2 q_1,\ \ldots,\ (A A^T)^{m-1} q_1\},
\end{aligned} \tag{8.4}$$

with initial vectors $p_1$ and $q_1 = A p_1 / \|A p_1\|$, respectively. Throughout this discussion, $\|\cdot\|$
denotes the Euclidean vector norm as well as the associated induced matrix norm.
Generically, the GKLB procedure for $m \ll \min\{\ell, n\}$ determines orthonormal bases
$\{p_1, p_2, \ldots, p_m\}$ and $\{q_1, q_2, \ldots, q_m\}$ for the Krylov subspaces (Equation 8.4). The GKLB
procedure is well suited for large matrices, since the routine only requires the evaluation of
matrix–vector products with $A$ and $A^T$. The steps of the GKLB procedure are presented in
Algorithm 8.1. A matrix interpretation at step $m$ of the computations of the GKLB
algorithm yields the partial GKLB decomposition,
$$A P_m = Q_m B_m, \qquad A^T Q_m = P_m B_m^T + r_m e_m^T, \tag{8.5}$$

where the matrices $P_m = [p_1, \ldots, p_m] \in \mathbb{R}^{n \times m}$ and $Q_m = [q_1, \ldots, q_m] \in \mathbb{R}^{\ell \times m}$ have
orthonormal columns, the residual vector $r_m \in \mathbb{R}^{n}$ satisfies $P_m^T r_m = 0$, and $e_m$ is the
$m$th axis vector of appropriate dimension. Further,
$$B_m = \begin{bmatrix}
\alpha_1 & \beta_1 & & & 0\\
& \alpha_2 & \beta_2 & & \\
& & \alpha_3 & \ddots & \\
& & & \ddots & \beta_{m-1}\\
0 & & & & \alpha_m
\end{bmatrix} \in \mathbb{R}^{m \times m} \tag{8.6}$$
is an upper bidiagonal matrix.
Algorithm 8.1 GKLB Procedure
Input:  $A \in \mathbb{R}^{\ell \times n}$: large rectangular matrix,
        $p_1 \in \mathbb{R}^{n}$: initial vector of unit length,
        $m$: number of bidiagonalization steps.
Output: $P_m := [p_1, p_2, \ldots, p_m] \in \mathbb{R}^{n \times m}$: matrix with orthonormal columns,
        $Q_m := [q_1, q_2, \ldots, q_m] \in \mathbb{R}^{\ell \times m}$: matrix with orthonormal columns,
        $B_m \in \mathbb{R}^{m \times m}$: upper bidiagonal matrix (8.6) with entries $\alpha_j$ and $\beta_j$,
        $r_m \in \mathbb{R}^{n}$: residual vector.

1. $P_1 := p_1$; $q_1 := A p_1$;
2. $\alpha_1 := \|q_1\|$; $q_1 := q_1/\alpha_1$; $Q_1 := q_1$;
3. for $j = 1 : m$
4.   $r_j := A^T q_j - \alpha_j p_j$;
5.   Reorthogonalization: $r_j := r_j - P_j (P_j^T r_j)$;
6.   if $j < m$ then
7.     $\beta_j := \|r_j\|$; $p_{j+1} := r_j/\beta_j$; $P_{j+1} := [P_j, p_{j+1}]$;
8.     $q_{j+1} := A p_{j+1} - \beta_j q_j$;
9.     Reorthogonalization: $q_{j+1} := q_{j+1} - Q_j (Q_j^T q_{j+1})$;
10.    $\alpha_{j+1} := \|q_{j+1}\|$; $q_{j+1} := q_{j+1}/\alpha_{j+1}$; $Q_{j+1} := [Q_j, q_{j+1}]$;
11.  endif
12. endfor
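For illustration, the following Python/numpy function is a direct transcription of Algorithm 8.1 (a sketch only: it assumes the algorithm does not terminate early, that is, all $\alpha_j > 0$ and $\beta_j > 0$, and omits the safeguards of production codes); the comments refer to the algorithm's line numbers.

```python
import numpy as np

def gklb(A, p1, m):
    """Partial GKLB decomposition (Algorithm 8.1 / Equation 8.5).

    Returns P (n x m), Q (l x m), an upper bidiagonal B (m x m), and the
    residual vector r satisfying P.T @ r = 0.
    """
    l, n = A.shape
    P = np.zeros((n, m)); Q = np.zeros((l, m)); B = np.zeros((m, m))
    P[:, 0] = p1
    q = A @ p1                                     # line 1
    B[0, 0] = np.linalg.norm(q)                    # line 2: alpha_1
    Q[:, 0] = q / B[0, 0]
    for j in range(m):                             # line 3
        r = A.T @ Q[:, j] - B[j, j] * P[:, j]      # line 4
        r -= P[:, :j + 1] @ (P[:, :j + 1].T @ r)   # line 5: reorthogonalization
        if j < m - 1:                              # line 6
            B[j, j + 1] = np.linalg.norm(r)        # line 7: beta_j
            P[:, j + 1] = r / B[j, j + 1]
            q = A @ P[:, j + 1] - B[j, j + 1] * Q[:, j]    # line 8
            q -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ q)       # line 9
            B[j + 1, j + 1] = np.linalg.norm(q)    # line 10: alpha_{j+1}
            Q[:, j + 1] = q / B[j + 1, j + 1]
    return P, Q, B, r                              # r = r_m of Equation 8.5
```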
The GKLB algorithm only requires matrix–vector products with the matrices $A$ and $A^T$, and
the input matrix $A$ can be replaced with functions that evaluate these products. To avoid
loss of orthogonality due to finite precision arithmetic, reorthogonalization is performed in
lines 5 and 9 of the GKLB algorithm; the reader is referred to the reorthogonalization
strategies discussed in the literature, for example, [19,27,33]. We do remark that Simon and
Zha [27] observed that when the matrix $A$ is not very ill-conditioned, only the columns of one
of the matrices $P_m$ or $Q_m$ need to be reorthogonalized, which reduces the computational
cost considerably when $\ell \gg n$.
The number of bidiagonalization steps $m \ll \min\{\ell, n\}$ is assumed to be small enough
that the partial GKLB decomposition (Equation 8.5) with the stated properties exists,
and $m > k$, so that an approximation to Equation 8.3 can be determined. We assume that the
GKLB algorithm does not terminate early, that is, all $\alpha_j > 0$ and $\beta_j > 0$ for $1 \leq j \leq m$;
see [1] for a detailed discussion on early termination.
Analogously to Equation 8.1, let $u_i^{(B)} \in \mathbb{R}^{m}$ and $v_i^{(B)} \in \mathbb{R}^{m}$ denote the left and right
singular vectors of $B_m$ associated with the singular value $\sigma_i^{(B)}$. Define $U_m^{(B)} = [u_1^{(B)}, u_2^{(B)}, \ldots, u_m^{(B)}] \in \mathbb{R}^{m \times m}$ and $V_m^{(B)} = [v_1^{(B)}, v_2^{(B)}, \ldots, v_m^{(B)}] \in \mathbb{R}^{m \times m}$ with orthonormal columns, as well
as $\Sigma_m^{(B)} = \operatorname{diag}[\sigma_1^{(B)}, \sigma_2^{(B)}, \ldots, \sigma_m^{(B)}] \in \mathbb{R}^{m \times m}$. Then

$$B_m V_m^{(B)} = U_m^{(B)} \Sigma_m^{(B)} \quad \text{and} \quad B_m^T U_m^{(B)} = V_m^{(B)} \Sigma_m^{(B)} \tag{8.7}$$

are the SVD of $B_m$ and $B_m^T$, respectively. We assume the singular values of $B_m$ are ordered
in the same manner as Equation 8.2.
Notice from Equations 8.5 and 8.7 that
$$\begin{aligned}
A P_m v_i^{(B)} - \sigma_i^{(B)} Q_m u_i^{(B)} &= 0,\\
A^T Q_m u_i^{(B)} - \sigma_i^{(B)} P_m v_i^{(B)} &= \left( e_m^T u_i^{(B)} \right) r_m.
\end{aligned} \tag{8.8}$$
Using Equation 8.8, we see that approximations to the singular triplets $\{\sigma_i, u_i, v_i\}$ of $A$
can be obtained from the triplets $\{\sigma_i^{(B)}, Q_m u_i^{(B)}, P_m v_i^{(B)}\}$. The criterion for accepting the
triplet $\{\sigma_i^{(B)}, Q_m u_i^{(B)}, P_m v_i^{(B)}\}$ as an approximation is obtained from the second equation
of Equation 8.8:

$$\left\| A^T Q_m u_i^{(B)} - \sigma_i^{(B)} P_m v_i^{(B)} \right\| = \left| e_m^T u_i^{(B)} \right| \cdot \|r_m\|. \tag{8.9}$$
The singular triplet is accepted as an approximate singular triplet of $A$ when the residual
error (Equation 8.9) is smaller than a prescribed tolerance, that is, when

$$\left| e_m^T u_i^{(B)} \right| \cdot \|r_m\| \leq \delta \|A\|, \tag{8.10}$$

for a user-specified value of $\delta$. The value of $\|A\|$ in the bound (Equation 8.10) is easily
approximated by the largest singular value of the bidiagonal matrix $B_m$. This computation
is of low cost, and using successive approximations during the iterations leads to a
fairly good estimate of $\|A\|$.
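In code, the acceptance test (Equation 8.10) applied to the output of the gklb sketch above takes the following form; the matrix, starting vector, and tolerance are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((200, 120))
p1 = rng.standard_normal(120); p1 /= np.linalg.norm(p1)
delta = 1e-8

P, Q, B, r = gklb(A, p1, m=20)
Ub, sb, Vbt = np.linalg.svd(B)               # SVD of B_m (Equation 8.7)
res = np.abs(Ub[-1, :]) * np.linalg.norm(r)  # |e_m^T u_i^(B)| ||r_m||  (8.9)
accepted = res <= delta * sb[0]              # sb[0] approximates ||A||  (8.10)

sigma = sb[accepted]                         # approximate singular values
U_approx = Q @ Ub[:, accepted]               # approximate left vectors  (8.8)
V_approx = P @ Vbt[accepted, :].T            # approximate right vectors (8.8)
```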
Typically, $\left| e_m^T u_i^{(B)} \right| \cdot \|r_m\|$ does not become smaller than a prescribed tolerance for
modest values of $m$. Prohibitive storage requirements for the basis vectors, along with the
computational cost, prevent increasing $m$ to obtain better approximations. Therefore, $m > k$
is kept of modest size, and a restarted algorithm is used.
8.3 Augmented Lanczos Bidiagonalization Method
An efficient method for restarting in the context of eigenvalue problems was proposed by
Sorensen [28]. This method can be thought of as a curtailed QR algorithm and requires the
user to select a strategy for choosing a sequence of Krylov subspaces via a shift selection
used to determine the invariant subspace; see [28] for details. The recursion formulas in [28]
were adapted to the GKLB method first by Björck et al. [7] in 1994 to solve ill-posed
problems and then by Kokiopoulou et al. [18]. The shifts are taken from the spectrum of
$A^T A$ and are applied through a bulge-and-chase algorithm. However, this implementation is
prone to round-off errors (see [5,32]); therefore, a mathematically equivalent implementation
that is less prone to round-off errors was developed.
The relationship to symmetric eigenvalue problems is obtained from the connection
to the Lanczos tridiagonal decomposition of the symmetric matrix $A^T A$, found by multiplying the
first equation in Equation 8.5 from the left by $A^T$:

$$A^T A P_m = P_m B_m^T B_m + \alpha_m r_m e_m^T. \tag{8.11}$$
Since $B_m$ is upper bidiagonal, this yields a valid Lanczos tridiagonal decomposition, where
$B_m^T B_m$ is an $m \times m$ symmetric tridiagonal matrix. The eigenvalues and eigenvectors of
$B_m^T B_m$, referred to as Ritz values and Ritz vectors, can be used to approximate the eigenvalues
and eigenvectors of $A^T A$; see, for example, [11, Section 9.1.2] for a discussion of Lanczos
tridiagonalization. Equation 8.11 provides a matrix representation of the Lanczos tridiagonal
decomposition method for creating an orthogonal basis for $K_m(A^T A, p_1)$ in Equation 8.4.
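The identity (Equation 8.11) is easy to verify numerically; continuing with A and the output of the gklb sketch above:

```python
import numpy as np

# A^T A P_m = P_m B_m^T B_m + alpha_m r_m e_m^T (Equation 8.11);
# the rank-one term only affects the last column.
lhs = A.T @ (A @ P)
rhs = P @ (B.T @ B)
rhs[:, -1] += B[-1, -1] * r     # alpha_m = B[m-1, m-1]
print(np.allclose(lhs, rhs))    # True up to round-off
```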
Wu and Simon's [33] mathematically equivalent approach to the symmetric IRAM of Sorensen
[28] is to augment $K_{m-k}(A^T A, p_{k+1})$ with Ritz vectors of $B_m^T B_m$:
$$\tilde{P}_k := \left[ P_m v_1^{(B)}, \ldots, P_m v_k^{(B)} \right] \in \mathbb{R}^{n \times k}, \tag{8.12}$$
where $p_{k+1} = r_m / \|r_m\|$, and start the Lanczos tridiagonal method with the orthogonal
matrix

$$\tilde{P}_{k+1} := \left[ P_m v_1^{(B)}, \ldots, P_m v_k^{(B)}, p_{k+1} \right] \in \mathbb{R}^{n \times (k+1)}. \tag{8.13}$$
Because of the relationship between the GKLB decomposition (Equation 8.5) and the
Lanczos decomposition (Equation 8.11), we can extend the idea of Wu and Simon [33]
to the GKLB method.
Multiplying the matrix (Equation 8.13) from the left by A and using Equation 8.8 yields
$$A \tilde{P}_{k+1} = \left[ \tilde{Q}_k, \ A p_{k+1} \right]
\begin{bmatrix}
\sigma_1^{(B)} & & & \\
& \ddots & & \\
& & \sigma_k^{(B)} & \\
& & & 1
\end{bmatrix}, \tag{8.14}$$
where

$$\tilde{Q}_k := \left[ Q_m u_1^{(B)}, \ldots, Q_m u_k^{(B)} \right] \in \mathbb{R}^{\ell \times k}. \tag{8.15}$$
The matrix $[\tilde{Q}_k, A p_{k+1}]$ in Equation 8.14 does not have orthogonal columns, since

$$(A p_{k+1})^T Q_m u_i^{(B)} = \rho_i, \tag{8.16}$$

where $\rho_i = \left( e_m^T u_i^{(B)} \right) \cdot \|r_m\|$. Using Equation 8.16, an orthonormal vector $q_{k+1}$ can be
created,
$$q_{k+1} = \frac{A p_{k+1} - \sum_{i=1}^{k} \rho_i\, Q_m u_i^{(B)}}{\alpha_{k+1}}, \tag{8.17}$$

with $\alpha_{k+1} = \left\| A p_{k+1} - \sum_{i=1}^{k} \rho_i\, Q_m u_i^{(B)} \right\|$, yielding the orthogonal matrix
$$\tilde{Q}_{k+1} := \left[ Q_m u_1^{(B)}, \ldots, Q_m u_k^{(B)}, q_{k+1} \right] \in \mathbb{R}^{\ell \times (k+1)}. \tag{8.18}$$
Equations 8.8, 8.13, 8.16, and 8.18 then yield

$$A \tilde{P}_{k+1} = \tilde{Q}_{k+1} \tilde{B}_{k+1}, \tag{8.19}$$

where $\tilde{B}_{k+1} \in \mathbb{R}^{(k+1) \times (k+1)}$ is the upper triangular matrix with diagonal entries
$\sigma_1^{(B)}, \ldots, \sigma_k^{(B)}, \alpha_{k+1}$ and last column $[\rho_1, \ldots, \rho_k, \alpha_{k+1}]^T$.
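Assembled in code, Equations 8.12 through 8.19 give the following restart step (a sketch continuing the gklb example above; the restarted method then resumes the GKLB recursion from these augmented matrices, see [1] for the complete algorithm).

```python
import numpy as np

def augment(A, P, Q, B, r, k):
    """Build P_{k+1}, Q_{k+1}, and B_{k+1} of Equations 8.13, 8.18, and 8.19
    from a partial GKLB decomposition (8.5).  Illustrative sketch only."""
    Ub, sb, Vbt = np.linalg.svd(B)               # SVD of B_m (8.7)
    beta = np.linalg.norm(r)
    p_new = r / beta                             # p_{k+1} = r_m / ||r_m||
    P_aug = np.column_stack([P @ Vbt[:k, :].T, p_new])         # (8.13)
    rho = Ub[-1, :k] * beta                      # rho_i = (e_m^T u_i^(B)) ||r_m||  (8.16)
    Qk = Q @ Ub[:, :k]                           # (8.15)
    q = A @ p_new - Qk @ rho                     # numerator of (8.17)
    alpha = np.linalg.norm(q)
    Q_aug = np.column_stack([Qk, q / alpha])     # (8.18)
    B_aug = np.zeros((k + 1, k + 1))             # upper triangular matrix of (8.19)
    B_aug[np.arange(k), np.arange(k)] = sb[:k]
    B_aug[:k, k] = rho
    B_aug[k, k] = alpha
    return P_aug, Q_aug, B_aug

# One restart: A @ P_aug == Q_aug @ B_aug up to round-off (Equation 8.19).
P1, Q1, B1 = augment(A, P, Q, B, r, k=5)
print(np.allclose(A @ P1, Q1 @ B1))
```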