Asynchronous Methods for Deep Reinforcement Learning

Abstract

We propose a conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent to optimize deep neural network controllers. We present asynchronous variants of four standard reinforcement learning algorithms and show that parallel actor-learners have a stabilizing effect on training, allowing all four methods to train neural network controllers successfully. The best-performing method, an asynchronous variant of actor-critic, surpasses the current state of the art on the Atari domain while training for half the time on a single multi-core CPU instead of a GPU. Furthermore, we show that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using visual input.
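The idea of parallel actor-learners applying gradient updates to shared parameters can be illustrated with a toy sketch. This is not the paper's algorithm or code. It assumes a hypothetical two-armed bandit (ARM_MEANS), a tabular softmax policy in place of a neural network, and lock-free in-place updates from Python threads. The names actor_learner, train, and shared are made up for illustration.

```python
import math
import random
import threading

# Hypothetical toy problem: a two-armed bandit where arm 1 pays more on average.
ARM_MEANS = [0.2, 0.8]


def softmax(logits):
    """Numerically stable softmax over a short list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def actor_learner(shared, seed, steps, lr):
    """One worker: samples actions with its own RNG and updates shared parameters."""
    rng = random.Random(seed)  # each worker explores with an independent stream
    for _ in range(steps):
        probs = softmax(list(shared["logits"]))  # snapshot, possibly stale
        action = 0 if rng.random() < probs[0] else 1
        reward = ARM_MEANS[action] + rng.gauss(0.0, 0.1)
        advantage = reward - shared["value"]  # critic serves as a baseline
        # Actor step: d log pi(a) / d logit_k = 1[k == a] - probs[k]
        for k in range(2):
            grad = (1.0 if k == action else 0.0) - probs[k]
            shared["logits"][k] += lr * advantage * grad  # written without a lock
        # Critic step: move the baseline toward the observed reward
        shared["value"] += lr * (reward - shared["value"])


def train(num_workers=4, steps=2000, lr=0.05):
    """Run several actor-learners concurrently on one shared parameter set."""
    shared = {"logits": [0.0, 0.0], "value": 0.0}
    workers = [
        threading.Thread(target=actor_learner, args=(shared, seed, steps, lr))
        for seed in range(num_workers)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return softmax(shared["logits"]), shared["value"]


if __name__ == "__main__":
    probs, value = train()
    print("policy:", probs, "baseline:", value)
```

Each worker reads a possibly stale copy of the policy and writes its update in place, so updates from different workers interleave. In this sketch the threads only illustrate the update pattern; CPython's global interpreter lock means they do not give a real multi-core speedup.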

Keywords

Asynchronous communication, Reinforcement learning, Computer science, Asynchronous learning, Artificial neural network, Domain (mathematical analysis), Artificial intelligence, Task (project management), Deep learning, Variety (cybernetics), Stochastic gradient descent, Gradient descent, Engineering, Computer network

Publication Info

Year: 2016
Type: preprint
Citations: 1690
Access: Closed

Cite This

APA Style

Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza et al. (2016). Asynchronous Methods for Deep Reinforcement Learning. arXiv (Cornell University). https://doi.org/10.48550/arxiv.1602.01783

Identifiers

DOI: 10.48550/arxiv.1602.01783