Abstract

Any non-associative reinforcement learning algorithm can be viewed as a method for performing function optimization through (possibly noise-corrupted) sampling of function values. We describe the results of simulations in which the optima of several deterministic functions studied by Ackley were sought using variants of REINFORCE algorithms. Some of the algorithms used here incorporated additional heuristic features resembling certain aspects of some of the algorithms used in Ackley's studies. Differing levels of performance were achieved by the various algorithms investigated, but a number of them performed at a level comparable to the best found in Ackley's studies on a number of the tasks, in spite of their simplicity. One of these variants, called REINFORCE/MENT, represents a novel but principled approach to reinforcement learning in nontrivial networks which incorporates an entropy maximization strategy. This was found to perform especially well on more hierarchically organized tasks.
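The abstract describes using REINFORCE-style reinforcement learning to optimize a function from sampled values. The paper's exact algorithms are not reproduced here, but the core idea can be sketched as a non-associative REINFORCE search: each bit of a candidate solution is drawn from a Bernoulli unit, and the unit's weight is nudged in the direction (reward minus baseline) times (sample minus probability). The function `reinforce_optimize` and its parameters below are illustrative names, not the authors' code.

```python
import math
import random

def reinforce_optimize(f, n_bits, steps=3000, lr=0.1, decay=0.9, seed=0):
    """Illustrative REINFORCE-style search over bit strings (not the paper's
    exact algorithm). Bit y_i ~ Bernoulli(p_i) with p_i = sigmoid(w_i);
    weights follow the REINFORCE update lr * (r - rbar) * (y_i - p_i),
    where rbar is an exponentially decayed reinforcement baseline."""
    rng = random.Random(seed)
    w = [0.0] * n_bits          # one weight per Bernoulli unit
    rbar = 0.0                  # running reinforcement baseline
    best = None                 # best (reward, bit string) sampled so far
    for _ in range(steps):
        p = [1.0 / (1.0 + math.exp(-wi)) for wi in w]
        y = [1 if rng.random() < pi else 0 for pi in p]
        r = f(y)                # (possibly noisy) function evaluation
        if best is None or r > best[0]:
            best = (r, list(y))
        for i in range(n_bits):
            w[i] += lr * (r - rbar) * (y[i] - p[i])
        rbar = decay * rbar + (1 - decay) * r
    return best

# Toy "one-max" objective: reward is the number of ones in the string.
best_r, best_y = reinforce_optimize(lambda y: sum(y), n_bits=8)
```

The REINFORCE/MENT variant mentioned in the abstract additionally maximizes the entropy of the sampling distribution to sustain exploration; a sketch of that would add an entropy-gradient term to the weight update above.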

Keywords

Reinforcement learning, Computer science, Connectionism, Artificial intelligence, Algorithm, Associative property, Heuristic, Maximization, Entropy, Function optimization, Artificial neural network, Machine learning, Mathematical optimization, Mathematics

Publication Info

Year: 1991
Type: Article
Volume: 3
Issue: 3
Pages: 241-268
Citations: 296
Access: Closed

Citation Metrics

296 citations (OpenAlex)

Cite This

Ronald J. Williams, Jing Peng (1991). Function Optimization using Connectionist Reinforcement Learning Algorithms. Connection Science, 3(3), 241-268. https://doi.org/10.1080/09540099108946587

Identifiers

DOI
10.1080/09540099108946587