Abstract
In this paper, we introduce PILCO, a practical, data-efficient model-based policy search method. PILCO reduces model bias, one of the key problems of model-based reinforcement learning, in a principled way. By learning a probabilistic dynamics model and explicitly incorporating model uncertainty into long-term planning, PILCO can cope with very little data and facilitates learning from scratch in only a few trials. Policy evaluation is performed in closed form using state-of-the-art approximate inference. Furthermore, policy gradients are computed analytically for policy improvement. We report unprecedented learning efficiency on challenging and high-dimensional control tasks.
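The abstract's key idea, propagating a distribution over states through a learned model so that expected cost and its policy gradient come out in closed form, can be illustrated with a toy analogue. The sketch below is not PILCO itself (which uses Gaussian process dynamics and moment matching); it assumes 1-D linear-Gaussian dynamics and a linear policy, for which the Gaussian state distribution and the analytic gradient are exact. All names and constants are illustrative.

```python
# Hypothetical 1-D analogue of PILCO-style policy search:
# dynamics x' = a*x + b*u + Gaussian noise, linear policy u = theta*x.
# Under these assumptions the predictive state distribution stays
# Gaussian, so the expected long-term cost E[sum_t x_t^2] and its
# derivative w.r.t. theta can both be computed in closed form.

def expected_cost_and_grad(theta, a=0.9, b=0.5, noise_var=0.01,
                           m0=1.0, v0=0.0, horizon=20):
    k = a + b * theta          # closed-loop gain
    m, v = m0, v0              # state mean and variance
    dm, dv = 0.0, 0.0          # their derivatives w.r.t. theta
    cost, grad = 0.0, 0.0
    for _ in range(horizon):
        # propagate the Gaussian state distribution one step,
        # carrying derivatives along with the moments
        m, v, dm, dv = (k * m,
                        k * k * v + noise_var,
                        b * m + k * dm,
                        2.0 * k * b * v + k * k * dv)
        cost += m * m + v              # E[x_t^2] for a Gaussian state
        grad += 2.0 * m * dm + dv
    return cost, grad

def optimize(theta=0.0, lr=0.05, steps=500):
    # gradient descent on the closed-form expected cost
    for _ in range(steps):
        _, g = expected_cost_and_grad(theta)
        theta -= lr * g
    return theta

theta_star = optimize()
# The quadratic cost is minimized by cancelling the dynamics,
# so theta_star converges toward -a/b.
```

In the full algorithm the same structure holds, but the dynamics model is a Gaussian process, the state distribution is propagated by approximate inference (moment matching), and the analytic gradients cover all policy parameters at once.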
Publication Info
- Year: 2011
- Type: article
- Pages: 465-472
- Citations: 1076
- Access: Closed