Abstract
Scientists working with large volumes of high-dimensional data, such as global climate patterns, stellar spectra, or human gene distributions, regularly confront the problem of dimensionality reduction: finding meaningful low-dimensional structures hidden in their high-dimensional observations. The human brain confronts the same problem in everyday perception, extracting from its high-dimensional sensory inputs—30,000 auditory nerve fibers or 10^6 optic nerve fibers—a manageably small number of perceptually relevant features. Here we describe an approach to solving dimensionality reduction problems that uses easily measured local metric information to learn the underlying global geometry of a data set. Unlike classical techniques such as principal component analysis (PCA) and multidimensional scaling (MDS), our approach is capable of discovering the nonlinear degrees of freedom that underlie complex natural observations, such as human handwriting or images of a face under different viewing conditions. In contrast to previous algorithms for nonlinear dimensionality reduction, ours efficiently computes a globally optimal solution, and, for an important class of data manifolds, is guaranteed to converge asymptotically to the true structure.
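The abstract describes the approach (later known as Isomap) only at a high level: measure local metric information among nearby points, use it to recover the global geometry of the data manifold, and then embed the points in a low-dimensional space. The sketch below illustrates that pipeline under standard assumptions rather than reproducing the paper's exact procedure: a k-nearest-neighbor graph supplies the local metric information, graph shortest paths approximate geodesic distances, and classical MDS produces the embedding. The dataset, neighborhood size, and output dimension are illustrative choices.

```python
# Minimal sketch of the pipeline described in the abstract:
# local distances -> neighborhood graph -> geodesic distances -> classical MDS.
# Dataset (swiss roll), k, and the 2-D output dimension are illustrative choices.
import numpy as np
from scipy.sparse.csgraph import shortest_path
from sklearn.neighbors import kneighbors_graph
from sklearn.datasets import make_swiss_roll

X, _ = make_swiss_roll(n_samples=1000, random_state=0)

# 1. Local metric information: Euclidean distances to each point's k nearest neighbors.
k = 10
knn = kneighbors_graph(X, n_neighbors=k, mode="distance")

# 2. Global geometry: geodesic distances approximated by shortest paths in the graph.
D = shortest_path(knn, method="D", directed=False)

# 3. Classical MDS on the geodesic distance matrix.
n = D.shape[0]
H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
B = -0.5 * H @ (D ** 2) @ H              # double-centered squared distances
eigvals, eigvecs = np.linalg.eigh(B)
idx = np.argsort(eigvals)[::-1][:2]      # keep the top-2 components for a 2-D embedding
Y = eigvecs[:, idx] * np.sqrt(eigvals[idx])

print(Y.shape)  # (1000, 2): recovered low-dimensional coordinates
```

With a well-sampled, connected manifold such as the swiss roll, the shortest-path distances approximate geodesics on the surface, which is what lets the final linear MDS step "unroll" a nonlinear structure that PCA or MDS on raw Euclidean distances would miss.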
Related Publications
- Nonlinear Dimensionality Reduction by Locally Linear Embedding
- Dimensionality reduction for visualizing single-cell data using UMAP
- Laplacian Eigenmaps for Dimensionality Reduction and Data Representation
- Introduction to Multivariate Analysis
- Learning Eigenfunctions Links Spectral Embedding and Kernel PCA
Publication Info
- Year: 2000
- Type: article
- Volume: 290
- Issue: 5500
- Pages: 2319-2323
- Citations: 13453
- Access: Closed
Identifiers
- DOI: 10.1126/science.290.5500.2319