Abstract
Randomized algorithms for very large matrix problems have received a great deal of attention in recent years.Much of this work was motivated by problems in large-scale data analysis, largely since matrices are popular structures with which to model data drawn from a wide range of application domains, and this work was performed by individuals from many different research communities.While the most obvious benefit of randomization is that it can lead to faster algorithms, either in worst-case asymptotic theory and/or numerical implementation, there are numerous other benefits that are at least as important.For example, the use of randomization can lead to simpler algorithms that are easier to analyze or reason about when applied in counterintuitive settings; it can lead to algorithms with more interpretable output, which is of interest in applications where analyst time rather than just computational time is of interest; it can lead implicitly to regularization and more robust output; and randomized algorithms can often be organized to exploit modern computational architectures better than classical numerical methods.This monograph will provide a detailed overview of recent work on the theory of randomized matrix algorithms as well as the application of those ideas to the solution of practical problems in large-scale data analysis.Throughout this review, an emphasis will be placed on a few simple core ideas that underlie not only recent theoretical advances but also the usefulness of these tools in large-scale data applications.Crucial in this context is the connection with the concept of statistical leverage.This concept has long been used in statistical regression diagnostics to identify outliers; and it has recently proved crucial in the development of improved worst-case matrix algorithms that are also amenable to high-quality numerical implementation and that are useful to domain scientists.This connection arises naturally when one explicitly decouples the effect of randomization in these matrix algorithms from the underlying linear algebraic structure.This decoupling also permits much finer control in the application of randomization, as well as the easier exploitation of domain knowledge.Most of the review will focus on random sampling algorithms and random projection algorithms for versions of the linear least-squares problem and the low-rank matrix approximation problem.These two problems are fundamental in theory and ubiquitous in practice.Randomized methods solve these problems by constructing and operating on a randomized sketch of the input matrix Afor random sampling methods, the sketch consists of a small number of carefully-sampled and rescaled columns/rows of A, while for random projection methods, the sketch consists of a small number of linear combinations of the columns/rows of A. Depending on the specifics of the situation, when compared with the best previously-existing deterministic algorithms, the resulting randomized algorithms have worst-case running time that is asymptotically faster; their numerical implementations are faster in terms of clock-time; or they can be implemented in parallel computing environments where existing numerical algorithms fail to run at all.Numerous examples illustrating these observations will be described in detail.
Keywords
Affiliated Institutions
Related Publications
Robust Solutions to Least-Squares Problems with Uncertain Data
We consider least-squares problems where the coefficient matrices A, b are unknown but bounded. We minimize the worst-case residual error using (convex) second-order cone progra...
Determinant Maximization with Linear Matrix Inequality Constraints
The problem of maximizing the determinant of a matrix subject to linear matrix inequalities (LMIs) arises in many fields, including computational geometry, statistics, system id...
Simulation-Based Optimization with Stochastic Approximation Using Common Random Numbers
The method of Common Random Numbers is a technique used to reduce the variance of difference estimates in simulation optimization problems. These differences are commonly used t...
Stable signal recovery from incomplete and inaccurate measurements
Abstract Suppose we wish to recover a vector x 0 ∈ ℝ 𝓂 (e.g., a digital signal or image) from incomplete and contaminated observations y = A x 0 + e ; A is an 𝓃 × 𝓂 matrix wi...
Random Matrix Theory and Wireless Communications
Random matrix theory has found many applications in physics, statistics and engineering since its inception. Although early developments were motivated by practical experimental...
Publication Info
- Year
- 2012
- Type
- book-chapter
- Citations
- 447
- Access
- Closed
External Links
Social Impact
Social media, news, blog, policy document mentions
Citation Metrics
Cite This
Identifiers
- DOI
- 10.1201/b11822-37