Abstract
Large datasets are increasingly common and are often difficult to interpret. Principal component analysis (PCA) is a technique for reducing the dimensionality of such datasets, increasing interpretability while minimizing information loss. It does so by creating new uncorrelated variables that successively maximize variance. Finding these new variables, the principal components, reduces to solving an eigenvalue/eigenvector problem. Because the new variables are defined by the dataset at hand rather than a priori, PCA is an adaptive data analysis technique. It is adaptive in another sense too: variants of the technique have been developed that are tailored to different data types and structures. This article begins by introducing the basic ideas of PCA and discussing what it can and cannot do. It then describes some variants of PCA and their applications.
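The eigenvalue formulation mentioned in the abstract can be illustrated with a minimal sketch. It assumes NumPy and the common covariance-matrix form of PCA; the function name `pca_eig` and the synthetic data are hypothetical choices made for illustration and are not taken from the article.

```python
import numpy as np

def pca_eig(X, n_components):
    """Illustrative PCA via eigendecomposition of the sample covariance matrix."""
    Xc = X - X.mean(axis=0)                  # centre each variable (column)
    cov = np.cov(Xc, rowvar=False)           # p x p sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: symmetric input, ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # reorder by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs[:, :n_components]     # directions of the leading components
    scores = Xc @ loadings                   # the new, uncorrelated variables
    explained = eigvals[:n_components] / eigvals.sum()  # share of total variance
    return scores, loadings, explained

# Synthetic data (an assumption for this sketch): 200 observations, 3 correlated variables.
rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.2]])
X = rng.normal(size=(200, 3)) @ mixing

scores, loadings, explained = pca_eig(X, n_components=2)
```

In this sketch the columns of `scores` are uncorrelated with one another and ordered by decreasing variance, which corresponds to the "successively maximize variance" property described above. `explained` gives the fraction of total variance each retained component accounts for, one simple way to gauge how much information is lost by the reduction.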
Publication Info
- Year: 2016
- Type: review
- Volume: 374
- Issue: 2065
- Pages: 20150202-20150202
- Citations: 8444
- Access: Closed
Identifiers
- DOI: 10.1098/rsta.2015.0202