Abstract

Large datasets are increasingly common and are often difficult to interpret. Principal component analysis (PCA) is a technique for reducing the dimensionality of such datasets, increasing interpretability while minimizing information loss. It does so by creating new uncorrelated variables that successively maximize variance. Finding these new variables, the principal components, reduces to solving an eigenvalue/eigenvector problem. Because the new variables are defined by the dataset at hand, not a priori, PCA is an adaptive data analysis technique. It is adaptive in another sense too, since variants of the technique have been developed that are tailored to different data types and structures. This article begins by introducing the basic ideas of PCA, discussing what it can and cannot do. It then describes some variants of PCA and their applications.
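
The eigenvalue/eigenvector formulation mentioned in the abstract can be illustrated with a minimal sketch. It assumes NumPy, the sample covariance matrix as the matrix being decomposed, and a small synthetic dataset; the function name `pca_eig` and the toy data are hypothetical and do not come from the article.

```python
import numpy as np


def pca_eig(X, n_components=None):
    """PCA via eigendecomposition of the sample covariance matrix.

    X has shape (n_samples, n_features). Returns (scores, loadings, variances).
    Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                  # centre each variable
    S = np.cov(Xc, rowvar=False)             # sample covariance (divides by n - 1)
    eigvals, eigvecs = np.linalg.eigh(S)     # S is symmetric; eigh returns ascending order
    order = np.argsort(eigvals)[::-1]        # reorder by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    if n_components is not None:
        eigvals = eigvals[:n_components]
        eigvecs = eigvecs[:, :n_components]
    scores = Xc @ eigvecs                    # the new, uncorrelated variables
    return scores, eigvecs, eigvals


# Synthetic data (an assumption): two variables driven by one latent factor,
# plus one independent noise variable.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
X = np.hstack([
    latent + 0.1 * rng.normal(size=(200, 1)),
    2.0 * latent + 0.1 * rng.normal(size=(200, 1)),
    rng.normal(size=(200, 1)),
])

scores, loadings, variances = pca_eig(X)
explained = variances / variances.sum()      # share of total variance per component
print(explained)
```

In this sketch the columns of `loadings` are the eigenvectors, `variances` holds the corresponding eigenvalues, and the columns of `scores` are mutually uncorrelated with variances in decreasing order. Decomposing the correlation matrix instead (that is, standardizing the variables first) is a common alternative when the variables are measured on different scales.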

Keywords

Principal component analysis; Interpretability; A priori and a posteriori; Curse of dimensionality; Uncorrelated; Computer science; Eigenvalues and eigenvectors; Variance (accounting); Dimensionality reduction; Sparse PCA; Data mining; Principal (computer security); Component (thermodynamics); Artificial intelligence; Machine learning; Mathematics; Statistics

Publication Info

Year: 2016
Type: review
Volume: 374
Issue: 2065
Pages: 20150202-20150202
Citations: 8444 (OpenAlex)
Access: Closed

Cite This

Ian T. Jolliffe, Jorge Cadima (2016). Principal component analysis: a review and recent developments. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 374(2065), 20150202-20150202. https://doi.org/10.1098/rsta.2015.0202

Identifiers

DOI: 10.1098/rsta.2015.0202