Abstract
In this paper, we introduce PILCO, a practical, data-efficient model-based policy search method. PILCO reduces model bias, one of the key problems of model-based reinforcement learning, in a principled way. By learning a probabilistic dynamics model and explicitly incorporating model uncertainty into long-term planning, PILCO can cope with very little data and facilitates learning from scratch in only a few trials. Policy evaluation is performed in closed form using state-of-the-art approximate inference. Furthermore, policy gradients are computed analytically for policy improvement. We report unprecedented learning efficiency on challenging and high-dimensional control tasks.
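The abstract's key idea, propagating a distribution over states through a learned model so that expected cost and its policy gradient come out in closed form, can be illustrated with a toy analogue. The sketch below is not PILCO itself (which uses Gaussian process dynamics and moment matching); it assumes 1-D linear-Gaussian dynamics and a linear policy, for which the Gaussian state distribution and the analytic gradient are exact. All names and constants are illustrative.

```python
# Hypothetical 1-D analogue of PILCO-style policy search:
# dynamics x' = a*x + b*u + Gaussian noise, linear policy u = theta*x.
# Under these assumptions the predictive state distribution stays
# Gaussian, so the expected long-term cost E[sum_t x_t^2] and its
# derivative w.r.t. theta can both be computed in closed form.

def expected_cost_and_grad(theta, a=0.9, b=0.5, noise_var=0.01,
                           m0=1.0, v0=0.0, horizon=20):
    k = a + b * theta          # closed-loop gain
    m, v = m0, v0              # state mean and variance
    dm, dv = 0.0, 0.0          # their derivatives w.r.t. theta
    cost, grad = 0.0, 0.0
    for _ in range(horizon):
        # propagate the Gaussian state distribution one step,
        # carrying derivatives along with the moments
        m, v, dm, dv = (k * m,
                        k * k * v + noise_var,
                        b * m + k * dm,
                        2.0 * k * b * v + k * k * dv)
        cost += m * m + v              # E[x_t^2] for a Gaussian state
        grad += 2.0 * m * dm + dv
    return cost, grad

def optimize(theta=0.0, lr=0.05, steps=500):
    # gradient descent on the closed-form expected cost
    for _ in range(steps):
        _, g = expected_cost_and_grad(theta)
        theta -= lr * g
    return theta

theta_star = optimize()
# The quadratic cost is minimized by cancelling the dynamics,
# so theta_star converges toward -a/b.
```

In the full algorithm the same structure holds, but the dynamics model is a Gaussian process, the state distribution is propagated by approximate inference (moment matching), and the analytic gradients cover all policy parameters at once.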
Publication Info
- Year: 2011
- Type: article
- Pages: 465-472
- Citations: 1076
- Access: Closed