Abstract

The Lasso, the Forward Stagewise regression and the Lars are closely related procedures recently proposed for linear regression problems. Each of them can produce sparse models and can be used both for estimation and variable selection. In practical implementations these algorithms are typically tuned to achieve optimal prediction accuracy. We show that, when the prediction accuracy is used as the criterion to choose the tuning parameter, in general these procedures are not consistent in terms of variable selection. That is, the selected sets of variables do not consistently identify the true set of important variables. In particular, we show that for any sample size n, when there are superfluous variables in the linear regression model and the design matrix is orthogonal, the probability of the procedures correctly identifying the true set of important variables is less than a constant (smaller than one) not depending on n. This result is also shown to hold for two-dimensional problems with general correlated design matrices. The results indicate that in problems where …
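A minimal simulation sketch (not from the paper) illustrating the abstract's claim: when the Lasso's tuning parameter is chosen for prediction accuracy, here via cross-validation with scikit-learn's LassoCV, the probability of recovering exactly the true set of important variables stays visibly below one even with an orthogonal design. All settings below (n, p, the coefficient vector, the noise level, the number of replications) are illustrative assumptions, not values from the paper.

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)

n, p = 100, 5
beta = np.array([3.0, 1.5, 2.0, 0.0, 0.0])   # last two variables are superfluous
true_support = beta != 0

n_reps = 200
correct = 0
for _ in range(n_reps):
    # Orthogonal design: orthonormal columns, scaled so that X'X = n * I.
    Q, _ = np.linalg.qr(rng.standard_normal((n, p)))
    X = np.sqrt(n) * Q
    y = X @ beta + rng.standard_normal(n)

    # Tuning parameter chosen by 5-fold cross-validated prediction error.
    fit = LassoCV(cv=5).fit(X, y)
    selected = np.abs(fit.coef_) > 1e-8
    correct += np.array_equal(selected, true_support)

print(f"Fraction of runs selecting exactly the true set: {correct / n_reps:.2f}")
```

Under these assumptions the reported fraction does not approach one as the number of replications grows, which is the pattern of selection inconsistency the abstract describes; it is meant only as an empirical illustration, not a proof.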

Keywords

Lasso (statistics), Design matrix, Feature selection, Linear regression, Variable selection, Regression analysis, Linear model, Model selection, Elastic net regularization, Statistics, Mathematical optimization

Related Publications

The Adaptive Lasso and Its Oracle Properties

The lasso is a popular technique for simultaneous estimation and variable selection. Lasso variable selection has been shown to be consistent under certain conditions. In this w...

2006, Journal of the American Statistical Association, 7303 citations

Publication Info

Year: 2006
Type: Article
Citations: 255
Access: Closed

Citation Metrics

255 (OpenAlex)

Cite This

Chenlei Leng, Yi Lin, Grace Wahba (2006). A Note on the Lasso and Related Procedures in Model Selection. Statistica Sinica, 16, 1273-1284.