Abstract

We introduce an efficient algorithm for the problem of online linear optimization in the bandit setting which achieves the optimal O*(√T) regret. The setting is a natural generalization of the nonstochastic multi-armed bandit problem, and the existence of an efficient optimal algorithm has been posed as an open problem in a number of recent papers. We show how the difficulties encountered by previous approaches are overcome by the use of a self-concordant potential function. Our approach presents a novel connection between online learning and interior point methods.
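To make the barrier-based idea concrete, the following is a minimal numerical sketch of bandit linear optimization regularized by a self-concordant barrier. The choice of domain (the unit ball), the barrier -log(1 - ||x||²), the Dikin-ellipsoid sampling scheme, the one-point loss estimator, and the step size eta are illustrative assumptions for this sketch, not specifics stated in the abstract; consult the paper for the actual algorithm and its analysis.

```python
# Hedged sketch: bandit linear optimization with a self-concordant barrier.
# Domain, barrier, sampling scheme, estimator, and step size are assumptions
# made for illustration, not details taken from the abstract.
import numpy as np

def barrier(x):
    # Self-concordant barrier for the open unit ball {x : ||x|| < 1}.
    return -np.log(1.0 - x @ x)

def barrier_grad(x):
    return 2.0 * x / (1.0 - x @ x)

def barrier_hess(x):
    s = 1.0 - x @ x
    return 2.0 * np.eye(x.size) / s + 4.0 * np.outer(x, x) / s**2

def ftrl_step(loss_sum, eta, x0, iters=50):
    # Approximately solve argmin_x  eta * <loss_sum, x> + barrier(x)
    # with damped Newton steps (an illustrative choice of solver).
    x = x0.copy()
    for _ in range(iters):
        g = eta * loss_sum + barrier_grad(x)
        H = barrier_hess(x)
        step = np.linalg.solve(H, g)
        lam = np.sqrt(g @ step)        # Newton decrement
        x = x - step / (1.0 + lam)     # damped step keeps x strictly inside the ball
    return x

def bandit_linear_opt(losses, eta=0.01, rng=None):
    """losses: (T, n) array of loss vectors; the learner only sees scalar costs."""
    rng = np.random.default_rng(rng)
    T, n = losses.shape
    x = np.zeros(n)                    # start at the center of the ball
    loss_sum = np.zeros(n)
    total = 0.0
    for t in range(T):
        H = barrier_hess(x)
        # Play a random point on the Dikin ellipsoid around x.
        evals, evecs = np.linalg.eigh(H)
        i = rng.integers(n)
        sgn = rng.choice([-1.0, 1.0])
        y = x + sgn * evecs[:, i] / np.sqrt(evals[i])
        cost = losses[t] @ y           # the only feedback observed
        total += cost
        # One-point unbiased estimate of the full loss vector.
        est = n * cost * sgn * np.sqrt(evals[i]) * evecs[:, i]
        loss_sum += est
        x = ftrl_step(loss_sum, eta, x)
    return total

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    losses = 0.1 * rng.normal(size=(2000, 5))
    print("cumulative loss:", bandit_linear_opt(losses, eta=0.05, rng=1))
```

The sketch follows the general pattern of follow-the-regularized-leader with a self-concordant regularizer: exploration is confined to the Dikin ellipsoid of the barrier, so the played point is always feasible, and the scalar bandit feedback is turned into an unbiased estimate of the loss vector before the regularized minimization is re-solved.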

Keywords

Regret, Generalization, Computer science, Mathematical optimization, Linear programming, Online algorithm, Algorithm, Connection (principal bundle), Optimization problem, Point (geometry), Function (biology), Mathematics, Machine learning

Publication Info

Year: 2008
Type: article
Pages: 263-274
Citations: 192
Access: Closed

Citation Metrics

192 citations (OpenAlex)

Cite This

Jacob Abernethy, Elad Hazan, Alexander Rakhlin (2008). Competing in the dark: An efficient algorithm for bandit linear optimization. ScholarlyCommons (University of Pennsylvania), 263-274.