Generalization and Scaling in Reinforcement Learning

Abstract

In associative reinforcement learning, an environment generates input vectors, a learning system generates possible output vectors, and a reinforcement function computes feedback signals from the input-output pairs. The task is to discover and remember input-output pairs that generate rewards. Especially difficult cases occur when rewards are rare, since the expected time for any algorithm can grow exponentially with the size of the problem. Nonetheless, if a reinforcement function possesses regularities, and a learning algorithm exploits them, learning time can be reduced below that of non-generalizing algorithms. This paper describes a neural network algorithm called complementary reinforcement back-propagation (CRBP), and reports simulation results on problems designed to offer differing opportunities for generalization.
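The interaction loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give CRBP's update rule, so the rule below is an assumption inferred from the algorithm's name (move toward the output that was tried after a reward, and toward its complement, with a smaller step, after a failure). The identity-mapping `reinforcement` function, the single-layer network, the learning rates, and all names are hypothetical choices made for the example.

```python
import math
import random


def reinforcement(x, y):
    """Hypothetical reinforcement function: reward only when output copies input."""
    return 1 if x == y else 0


class ComplementaryLearner:
    """Single-layer stochastic network with an assumed CRBP-style update."""

    def __init__(self, n_bits, rng, lr_reward=0.5, lr_penalty=0.05):
        self.n = n_bits
        self.rng = rng
        self.lr_reward = lr_reward    # step size after a reward (assumed value)
        self.lr_penalty = lr_penalty  # smaller step after a failure (assumed value)
        # One weight row per output unit; the last column is a bias weight.
        self.w = [[0.0] * (n_bits + 1) for _ in range(n_bits)]

    def probs(self, x):
        """Logistic output of each unit, read as the probability of emitting 1."""
        xs = list(x) + [1]
        return [
            1.0 / (1.0 + math.exp(-sum(wi * xi for wi, xi in zip(row, xs))))
            for row in self.w
        ]

    def act(self, x):
        """Sample a binary output vector from the unit probabilities."""
        p = self.probs(x)
        y = tuple(1 if self.rng.random() < pj else 0 for pj in p)
        return p, y

    def learn(self, x, p, y, r):
        """Train toward y after a reward, toward the complement of y otherwise."""
        if r:
            target, lr = y, self.lr_reward
        else:
            target, lr = tuple(1 - yj for yj in y), self.lr_penalty
        xs = list(x) + [1]
        for j in range(self.n):
            err = target[j] - p[j]  # cross-entropy gradient for a logistic unit
            for i in range(self.n + 1):
                self.w[j][i] += lr * err * xs[i]


def run(n_bits=4, trials=3000, seed=0):
    """Environment emits inputs, learner emits outputs, reinforcement scores pairs."""
    rng = random.Random(seed)
    learner = ComplementaryLearner(n_bits, rng)
    rewards = []
    for _ in range(trials):
        x = tuple(rng.randint(0, 1) for _ in range(n_bits))
        p, y = learner.act(x)
        r = reinforcement(x, y)
        learner.learn(x, p, y, r)
        rewards.append(r)
    return learner, rewards
```

Calling `run()` returns the trained learner and the per-trial reward sequence. Because the weights are shared across all input patterns, a reward earned on one input also shifts the outputs for inputs never seen, which is the kind of generalization a lookup table cannot provide.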

Keywords

Reinforcement learning, Generalization, Computer science, Artificial intelligence, Reinforcement, Associative property, Function (biology), Exploit, Learning classifier system, Function approximation, Artificial neural network, Task (project management), Machine learning, Mathematics, Engineering

Related Publications

Playing Atari with Deep Reinforcement Learning

Volodymyr Mnih, Koray Kavukcuoglu, David Silver +4 more

We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning. The model is a convolu...

2013 arXiv (Cornell University) 5109 citations

Generalization in Reinforcement Learning: Safely Approximating the Value Function

Justin A. Boyan, Andrew Moore

A straightforward approach to the curse of dimensionality in reinforcement learning and dynamic programming is to replace the lookup table with a generalizing function approximat...

1994 506 citations

Function Optimization using Connectionist Reinforcement Learning Algorithms

Ronald J. Williams, Jing Peng

Any non-associative reinforcement learning algorithm can be viewed as a method for performing function optimization through (possibly noise-corrupted) sampling of function value...

1991 Connection Science 296 citations

On the use of backpropagation in associative reinforcement learning

Williams

A description is given of several ways that backpropagation can be useful in training networks to perform associative reinforcement learning tasks. One way is to train a second ...

1988 IEEE International Conference on Neur... 74 citations

Input generalization in delayed reinforcement learning: an algorithm and performance comparisons

David Chapman, Leslie Pack Kaelbling

Delayed reinforcement learning is an attractive framework for the unsupervised learning of action policies for autonomous agents. Some existing delayed reinforcement learning te...

1991 249 citations

Publication Info

Year: 1989
Type: article
Volume: 2
Pages: 550-557
Citations: 54
Access: Closed

Cite This

APA Style

                            
                                    David H. Ackley, 
                                
                                    Michael L. Littman
                                
                            (1989). 
                            Generalization and Scaling in Reinforcement Learning. 
                            
                            , 2
                            
                            , 550-557.