Abstract

Given a covariance matrix, we consider the problem of maximizing the variance explained by a particular linear combination of the input variables while constraining the number of nonzero coefficients in this combination. This problem arises in the decomposition of a covariance matrix into sparse factors or sparse principal component analysis (PCA), and has wide applications ranging from biology to finance. We use a modification of the classical variational representation of the largest eigenvalue of a symmetric matrix, where cardinality is constrained, and derive a semidefinite programming–based relaxation for our problem. We also discuss Nesterov's smooth minimization technique applied to the semidefinite program arising in the semidefinite relaxation of the sparse PCA problem. The method has complexity $O(n^4 \sqrt{\log(n)}/\epsilon)$, where n is the size of the underlying covariance matrix and $\epsilon$ is the desired absolute accuracy on the optimal value of the problem.

Keywords

Semidefinite programmingSparse PCAMathematicsRelaxation (psychology)Eigenvalues and eigenvectorsPositive-definite matrixApplied mathematicsSparse approximationMatrix (chemical analysis)Covariance matrixMathematical optimizationCovariancePrincipal component analysisLinear programmingEigendecomposition of a matrixCombinatoricsAlgorithmStatistics

Related Publications

Decoding by Linear Programming

This paper considers a natural error correcting problem with real valued input/output. We wish to recover an input vector f/spl isin/R/sup n/ from corrupted measurements y=Af+e....

2005 IEEE Transactions on Information Theory 7166 citations

Publication Info

Year
2007
Type
article
Volume
49
Issue
3
Pages
434-448
Citations
666
Access
Closed

External Links

Social Impact

Social media, news, blog, policy document mentions

Citation Metrics

666
OpenAlex

Cite This

Alexandre d’Aspremont, Laurent El Ghaoui, Michael I. Jordan et al. (2007). A Direct Formulation for Sparse PCA Using Semidefinite Programming. SIAM Review , 49 (3) , 434-448. https://doi.org/10.1137/050645506

Identifiers

DOI
10.1137/050645506