Dual Averaging Method for Regularized Stochastic Learning and Online Optimization

Lin Xiao
2009 · 605 citations

Abstract

We consider regularized stochastic learning and online optimization problems, where the objective function is the sum of two convex terms: one is the loss function of the learning task, and the other is a simple regularization term, such as the ℓ1-norm for promoting sparsity. We develop a new online algorithm, the regularized dual averaging (RDA) method, that can explicitly exploit the regularization structure in an online setting. In particular, at each iteration, the learning variables are adjusted by solving a simple optimization problem that involves the running average of all past subgradients of the loss functions and the whole regularization term, not just its subgradient. Computational experiments show that the RDA method can be very effective for sparse online learning with ℓ1-regularization.
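The abstract only sketches the update rule. As an informal illustration of the idea, the Python sketch below implements an ℓ1-regularized RDA step for logistic regression: it keeps a running average of all past loss subgradients and, at each step, solves the RDA subproblem in closed form using the whole ℓ1 term. It assumes the auxiliary strongly convex function h(w) = ½‖w‖² and the step-size sequence β_t = γ√t; the function name l1_rda_logistic and the parameters lam and gamma are illustrative choices, not names from the paper.

```python
import numpy as np

def l1_rda_logistic(X, y, lam=0.01, gamma=1.0, passes=1):
    """Sketch of l1-regularized RDA for logistic regression (labels in {-1, +1}).

    Each step averages all past subgradients of the loss and solves the RDA
    subproblem in closed form, applying the full l1 term (a soft threshold on
    the averaged subgradient) rather than just its subgradient.
    """
    n, d = X.shape
    w = np.zeros(d)
    g_bar = np.zeros(d)   # running average of loss subgradients
    t = 0
    for _ in range(passes):
        for i in range(n):
            t += 1
            # subgradient of the logistic loss log(1 + exp(-y * x.w)) at the current w
            margin = y[i] * X[i].dot(w)
            g = -y[i] * X[i] / (1.0 + np.exp(margin))
            g_bar += (g - g_bar) / t
            # closed-form RDA update with h(w) = 0.5*||w||^2 and beta_t = gamma*sqrt(t):
            # coordinates whose averaged subgradient is below lam are set exactly to zero
            shrink = np.maximum(np.abs(g_bar) - lam, 0.0)
            w = -(np.sqrt(t) / gamma) * np.sign(g_bar) * shrink
    return w

# toy usage on synthetic data with a sparse ground-truth weight vector
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 20))
w_true = np.zeros(20)
w_true[:3] = [2.0, -1.5, 1.0]
y = np.sign(X.dot(w_true) + 0.1 * rng.standard_normal(200))
w_hat = l1_rda_logistic(X, y, lam=0.05, gamma=1.0, passes=3)
print("nonzero coordinates:", np.count_nonzero(w_hat))
```

Because the entire ℓ1 term enters each subproblem, coordinates are thresholded to exactly zero whenever the averaged subgradient is small, which is what makes the method attractive for sparse online learning.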

Keywords

Subgradient method, Proximal gradient methods for learning, Mathematical optimization, Regularization (mathematics), Convex optimization, Convex function, Computer science, Stochastic optimization, Optimization problem, Lipschitz continuity, Mathematics, Minimization, Artificial intelligence, Convex analysis

Publication Info

Year: 2009
Type: article
Volume: 11
Issue: 88
Pages: 2116-2124
Citations: 605
Access: Closed

Citation Metrics

605 citations (OpenAlex)

Cite This

Lin Xiao (2009). Dual Averaging Method for Regularized Stochastic Learning and Online Optimization. 11(88), 2116-2124.