• Construct an n × ℓ, with ℓ = O(k/ε), structured random projection matrix Ω, for
example, uniformly sample a few rows from a randomized Hadamard transform (see,
e.g., [31] for a precise definition of the randomized Hadamard transform).
• Return B = AΩ.
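As a concrete illustration of these two steps, here is a minimal Python sketch (ours, for exposition; not code from [63]) that forms B = AΩ with a subsampled randomized Hadamard transform. Note that scipy.linalg.hadamard requires n to be a power of 2, and a production implementation would apply the transform in O(n log n) time per row rather than forming H densely.

```python
import numpy as np
from scipy.linalg import hadamard  # requires n to be a power of 2

def srht_sketch(A, ell, seed=None):
    """Return B = A @ Omega, where Omega is a subsampled randomized
    Hadamard transform: random sign flips, a Hadamard rotation, and
    a uniform sample of ell columns (rescaled to keep norms unbiased)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    signs = rng.choice([-1.0, 1.0], size=n)        # random diagonal D
    H = hadamard(n) / np.sqrt(n)                   # orthonormal Hadamard
    cols = rng.choice(n, size=ell, replace=False)  # uniform column sample
    Omega = (signs[:, None] * H)[:, cols] * np.sqrt(n / ell)
    return A @ Omega
```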
This algorithm, which amounts to choosing uniformly at random a small number of
columns in a randomly rotated basis, was introduced in [63], where it is proven that
‖A − P_{B_k}A‖_F ≤ (1 + ε) ‖A − P_{U_k}A‖_F    (9.6)
where P_{B_k}A is the projection of A onto the best rank-k approximation of B, holds with
high probability. This bound, which is the random projection analog of the relative-error
CUR matrix approximations of [30,50], provides a bound only on the reconstruction error
of the top part of the spectrum of the input matrix. Additionally, it necessitates sampling
a relatively large number of columns, ℓ = O(k/ε).
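For concreteness, the left-hand side of Equation 9.6 can be evaluated as follows; this is a hedged sketch (the function name is ours), in which P_{B_k}A is computed by projecting A onto the span of the top-k left singular vectors of the sketch B.

```python
import numpy as np

def projection_error_fro(A, B, k):
    """||A - P_{B_k} A||_F: project A onto the span of the top-k left
    singular vectors of the sketch B and measure the Frobenius residual."""
    U_k = np.linalg.svd(B, full_matrices=False)[0][:, :k]
    return np.linalg.norm(A - U_k @ (U_k.T @ A), "fro")
```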
In many practical applications, for example, when providing high-quality numerical
implementations, it is preferable to parameterize the problem in order to choose some
number ℓ = k + p of columns, where p is a modest additive oversampling factor, for example,
p equal to 10, 20, or k. When attempting to be this aggressive at minimizing the size
of the sample, the choice of the oversampling factor p is quite sensitive to the input. That
is, whereas the bound of Equation 9.6 holds for any worst-case input, here the proper
choice for the oversampling factor p could depend on the matrix dimensions, the decay
properties of the spectrum, and the particular choice made for the random projection
matrix [43,44,47,52,61,69].
To deal with these issues, the best numerical implementations of RandNLA algorithms
for low-rank matrix approximation, and those that obtain the strongest results in terms
of minimizing p, take advantage of Lemma 9.2 in a somewhat different way than was
originally used in the analysis of the CSSP. For example, rather than choosing O(k log k)
dimensions and then filtering them through exactly k dimensions, one can choose some
number ℓ of dimensions and project onto a k′-dimensional subspace, where k < k′ ≤ ℓ, while
exploiting Lemma 9.2 to bound the error, as appropriate for the computational environment
at hand [44].
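In code, this strategy might look as follows (a sketch under our own naming, not the implementation of [44]): orthonormalize B = AΩ and, rather than filtering through exactly k dimensions, retain k′ of them with k < k′ ≤ ℓ.

```python
import numpy as np

def project_through_kprime(A, Omega, kprime=None):
    """Project A onto a k'-dimensional subspace of range(A @ Omega).
    With kprime=None, keep all ell dimensions (the projection Q @ Q.T @ A);
    otherwise filter through the top-k' left singular directions of B."""
    B = A @ Omega
    if kprime is None:
        Q, _ = np.linalg.qr(B)
    else:
        Q = np.linalg.svd(B, full_matrices=False)[0][:, :kprime]
    return Q @ (Q.T @ A)
```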
Consider, for example, the following random projection algorithm. Given A ∈ R^{m×n}, a
rank parameter k, and an oversampling factor p:
• Set ℓ = k + p.
• Construct an n × ℓ random projection matrix Ω, either with i.i.d. Gaussian entries or
in the form of a structured random projection such as uniformly sampling a few rows
from a randomized Hadamard transform.
• Return B = AΩ.
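A minimal sketch of the Gaussian variant of this algorithm (the function name and default p are illustrative choices):

```python
import numpy as np

def gaussian_sketch(A, k, p=10, seed=None):
    """Return B = A @ Omega for an n x (k + p) matrix Omega with i.i.d.
    standard Gaussian entries; p is the additive oversampling factor."""
    rng = np.random.default_rng(seed)
    Omega = rng.standard_normal((A.shape[1], k + p))
    return A @ Omega
```

For instance, gaussian_sketch(A, k=20, p=10) returns an m × 30 sketch of A.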
Although this approach is quite similar to the algorithms of [58,63], algorithms parameter-
ized in this form were first introduced in [47,52,69], where a suite of bounds of the form
‖A − Z‖_2 ≲ 10 √(ℓ min{m, n}) ‖A − A_k‖_2
were shown to hold with high probability. Here, Z is a matrix of rank k or greater that is
easily constructed from B. Such results can be used to obtain the so-called interpolative
decomposition, a variant of the basic CSSP with explicit numerical conditioning properties,
and [47,52,69] also provided a posteriori error estimates that are useful in situations where
one wants to choose the rank parameter k to be the numerical rank, as opposed to specifying
it a priori.
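As a rough illustration of such a posteriori estimation, the following sketch uses the Gaussian-probe bound popularized by Halko, Martinsson, and Tropp; the constant 10√(2/π) and the function itself are assumptions for exposition, not quotations from [47,52,69]. With r probes, the returned value upper-bounds ‖(I − QQ^T)A‖_2 with probability at least 1 − 10^{−r}.

```python
import numpy as np

def a_posteriori_spectral_error(A, Q, r=5, seed=None):
    """Estimate ||(I - Q Q^T) A||_2 from r Gaussian probe vectors:
    10 * sqrt(2/pi) * max_i ||(I - Q Q^T) A w_i|| bounds the true
    spectral error with probability at least 1 - 10**(-r)."""
    rng = np.random.default_rng(seed)
    worst = 0.0
    for _ in range(r):
        w = rng.standard_normal(A.shape[1])
        y = A @ w
        worst = max(worst, np.linalg.norm(y - Q @ (Q.T @ y)))
    return 10.0 * np.sqrt(2.0 / np.pi) * worst
```

One can query such an estimate while increasing ℓ, stopping once the residual falls below a tolerance; this is one way the rank parameter k can be matched to the numerical rank in practice.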