Dual Averaging Method for Regularized Stochastic Learning and Online Optimization

Lin Xiao
2009 · 605 citations

Abstract

We consider regularized stochastic learning and online optimization problems, where the objective function is the sum of two convex terms: one is the loss function of the learning task, and the other is a simple regularization term, such as the ℓ1-norm for promoting sparsity. We develop a new online algorithm, the regularized dual averaging (RDA) method, that can explicitly exploit the regularization structure in an online setting. In particular, at each iteration, the learning variables are adjusted by solving a simple optimization problem that involves the running average of all past subgradients of the loss functions and the whole regularization term, not just its subgradient. Computational experiments show that the RDA method can be very effective for sparse online learning with ℓ1-regularization.
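The abstract only sketches the update rule. As an informal illustration of the idea, the Python sketch below implements an ℓ1-regularized RDA step for logistic regression: it keeps a running average of all past loss subgradients and, at each step, solves the RDA subproblem in closed form using the whole ℓ1 term. It assumes the auxiliary strongly convex function h(w) = ½‖w‖² and the step-size sequence β_t = γ√t; the function name l1_rda_logistic and the parameters lam and gamma are illustrative choices, not names from the paper.

```python
import numpy as np

def l1_rda_logistic(X, y, lam=0.01, gamma=1.0, passes=1):
    """Sketch of l1-regularized RDA for logistic regression (labels in {-1, +1}).

    Each step averages all past subgradients of the loss and solves the RDA
    subproblem in closed form, applying the full l1 term (a soft threshold on
    the averaged subgradient) rather than just its subgradient.
    """
    n, d = X.shape
    w = np.zeros(d)
    g_bar = np.zeros(d)   # running average of loss subgradients
    t = 0
    for _ in range(passes):
        for i in range(n):
            t += 1
            # subgradient of the logistic loss log(1 + exp(-y * x.w)) at the current w
            margin = y[i] * X[i].dot(w)
            g = -y[i] * X[i] / (1.0 + np.exp(margin))
            g_bar += (g - g_bar) / t
            # closed-form RDA update with h(w) = 0.5*||w||^2 and beta_t = gamma*sqrt(t):
            # coordinates whose averaged subgradient is below lam are set exactly to zero
            shrink = np.maximum(np.abs(g_bar) - lam, 0.0)
            w = -(np.sqrt(t) / gamma) * np.sign(g_bar) * shrink
    return w

# toy usage on synthetic data with a sparse ground-truth weight vector
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 20))
w_true = np.zeros(20)
w_true[:3] = [2.0, -1.5, 1.0]
y = np.sign(X.dot(w_true) + 0.1 * rng.standard_normal(200))
w_hat = l1_rda_logistic(X, y, lam=0.05, gamma=1.0, passes=3)
print("nonzero coordinates:", np.count_nonzero(w_hat))
```

Because the entire ℓ1 term enters each subproblem, coordinates are thresholded to exactly zero whenever the averaged subgradient is small, which is what makes the method attractive for sparse online learning.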

Keywords

Subgradient method, Proximal gradient methods for learning, Mathematical optimization, Regularization (mathematics), Convex optimization, Convex function, Computer science, Stochastic optimization, Optimization problem, Lipschitz continuity, Mathematics, Minimization, Artificial intelligence, Convex analysis

Publication Info

Year: 2009
Type: article
Volume: 11
Issue: 88
Pages: 2116-2124
Citations: 605
Access: Closed

Citation Metrics

605 citations (OpenAlex)

Cite This

Lin Xiao (2009). Dual Averaging Method for Regularized Stochastic Learning and Online Optimization. 11(88), 2116-2124.