Abstract
The Lasso, the Forward Stagewise regression and the Lars are closely related procedures recently proposed for linear regression problems. Each of them can produce sparse models and can be used both for estimation and variable selection. In practical implementations these algorithms are typically tuned to achieve optimal prediction accuracy. We show that, when prediction accuracy is used as the criterion to choose the tuning parameter, these procedures are in general not consistent in terms of variable selection. That is, the sets of variables they select do not consistently recover the true set of important variables. In particular, we show that for any sample size n, when there are superfluous variables in the linear regression model and the design matrix is orthogonal, the probability of the procedures correctly identifying the true set of important variables is less than a constant (smaller than one) that does not depend on n. This result is also shown to hold for two-dimensional problems with general correlated design matrices. The results indicate that in problems where …
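As a rough illustration of the abstract's claim (not part of the paper), the Python sketch below simulates repeated Lasso fits on an orthogonal design containing superfluous variables, with the penalty chosen for prediction accuracy via cross-validation using scikit-learn's LassoCV. The design, the coefficient values and the use of LassoCV are assumptions made for illustration only; the point is that the fraction of replications in which the selected support exactly equals the true support stays noticeably below one.

```python
# Illustrative simulation only (our assumptions, not the paper's code): Lasso tuned by
# cross-validated prediction error on an orthogonal design with superfluous variables.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, p = 100, 8                       # sample size and number of predictors
true_support = {0, 1, 2}            # indices of the truly important variables
beta = np.zeros(p)
beta[[0, 1, 2]] = [3.0, 2.0, 1.5]   # the remaining p - 3 variables are superfluous

n_reps, n_correct = 200, 0
for _ in range(n_reps):
    # Orthogonalized design: columns of X are orthogonal, each with squared norm n
    X = np.linalg.qr(rng.standard_normal((n, p)))[0] * np.sqrt(n)
    y = X @ beta + rng.standard_normal(n)

    # Tuning parameter chosen for prediction accuracy (5-fold cross-validation)
    fit = LassoCV(cv=5).fit(X, y)
    selected = set(np.flatnonzero(fit.coef_))
    n_correct += (selected == true_support)

# Even with a strong signal, the empirical probability of selecting exactly the
# true set stays bounded away from one; increasing n does not change this.
print("P(selected set == true set) approx", n_correct / n_reps)
```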
Publication Info
- Year: 2006
- Type: article
- Citations: 255
- Access: Closed