Related Publications
Playing Atari with Deep Reinforcement Learning
We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning. The model is a convolu...
Competing in the dark: An efficient algorithm for bandit linear optimization
We introduce an efficient algorithm for the problem of online linear optimization in the bandit setting which achieves the optimal O*(√T) regret. The setting is a natural general...
Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning
Many real-world problems, such as network packet routing and urban traffic control, are naturally modeled as multi-agent reinforcement learning (RL) problems. However, existing ...
Adaptive Subgradient Methods for Online Learning and Stochastic Optimization
We present a new family of subgradient methods that dynamically incorporate knowledge of the geometry of the data observed in earlier iterations to perform more informative grad...
Publication Info
- Year: 2011
- Type: article
- Volume: 61
- Issue: 3
- Pages: 203-230
- Citations: 201
- Access: Closed
Identifiers
- DOI: 10.1007/s10472-011-9258-6