Abstract
We propose a new method for estimation in linear models. The ‘lasso’ minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant. Because of the nature of this constraint it tends to produce some coefficients that are exactly 0 and hence gives interpretable models. Our simulation studies suggest that the lasso enjoys some of the favourable properties of both subset selection and ridge regression. It produces interpretable models like subset selection and exhibits the stability of ridge regression. There is also an interesting relationship with recent work in adaptive function estimation by Donoho and Johnstone. The lasso idea is quite general and can be applied in a variety of statistical models: extensions to generalized regression models and tree-based models are briefly described.
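To make the "exactly 0" behaviour concrete, here is a minimal sketch in plain Python. It assumes the penalized (Lagrangian) form of the problem, which minimizes half the residual sum of squares plus `lam` times the sum of absolute coefficients; this is the usual counterpart of the constrained form described in the abstract. Cyclic coordinate descent is used only because it is short to write, and it is not claimed to be the algorithm used in the paper. The names `soft_threshold`, `lasso_coordinate_descent`, and `lam`, and the toy data, are illustrative assumptions.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; return exactly 0.0 when |z| <= gamma."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize 0.5 * RSS + lam * sum(|beta_j|) by cyclic coordinate descent.

    X is a list of rows, y a list of responses. No intercept is fitted
    (an assumption made to keep the sketch short).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    # Squared norm of each column, used to scale the coordinate update.
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Inner product of column j with the residual that ignores
            # coefficient j's own contribution.
            rho = 0.0
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
    return beta


# Toy data (hypothetical): two orthogonal predictors, one strong and one weak.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [3.0, 0.2, 3.0, 0.2]

beta_ols = lasso_coordinate_descent(X, y, lam=0.0)    # no penalty: least squares
beta_lasso = lasso_coordinate_descent(X, y, lam=1.0)  # penalized fit
print(beta_ols, beta_lasso)
```

With the penalty switched on, the weak predictor's coefficient is set to exactly zero while the strong one is only shrunk, which is the selection-plus-shrinkage behaviour the abstract attributes to the lasso.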
Related Publications
Regularization Paths for Generalized Linear Models via Coordinate Descent
We develop fast algorithms for estimation of generalized linear models with convex penalties. The models include linear regression, two-class logistic regression, and multinom...
Simultaneous Regression Shrinkage, Variable Selection, and Supervised Clustering of Predictors with OSCAR
Variable selection can be challenging, particularly in situations with a large number of predictors with possibly high correlations, such as gene expression data. In thi...
Simple means to improve the interpretability of regression coefficients
Linear regression models are an important statistical tool in evolutionary and ecological studies. Unfortunately, these models often yield some uninterpretable estima...
Collinearity: a review of methods to deal with it and a simulation study evaluating their performance
Collinearity refers to the non-independence of predictor variables, usually in a regression‐type analysis. It is a common feature of any descriptive ecological data set and can ...
Centroids
The concept of centroid is the multivariate equivalent of the mean. Just like the mean, the centroid of a cloud of points minimizes the sum of the squared distances fro...
Publication Info
- Year: 1996
- Type: article
- Volume: 58
- Issue: 1
- Pages: 267-288
- Citations: 49419
- Access: Closed
Identifiers
- DOI: 10.1111/j.2517-6161.1996.tb02080.x