On the Design of LQR Kernels for Efficient Controller Learning

Marco, Alonso; Hennig, Philipp; Schaal, Stefan; Trimpe, Sebastian

arXiv.org Machine Learning 

A core problem of learning control is to determine optimal feedback controllers for (partially) unknown nonlinear systems from experimental data. Reinforcement learning (RL) [1], [2] is a promising framework for this, yet it often requires many experiments on the physical system just to find suitable controllers, which limits the applicability of such techniques. Considerable research effort has therefore gone into improving the data efficiency of RL, with the aim of learning controllers from as few experiments as possible. Recently, Bayesian optimization (BO) has been proposed for RL as a promising approach in this direction. BO employs a probabilistic description of the latent objective function, typically a Gaussian process (GP), which allows the next control experiment to be selected in a principled manner, e.g., to maximize information gain [3] or to perform safe exploration [4]. While BO provides a promising framework for learning controllers in fairly general settings, the full power of Bayesian learning is often not exploited. A key advantage of Bayesian methods is that they allow prior problem knowledge to be combined with learning from data in a principled manner. For GP models, this concerns specifically the choice of the kernel, which captures the covariance between function values at different inputs and is thus the core component for modeling prior knowledge about the shape of the function. By relying on standard kernels, however, naive BO approaches often fail to exploit this opportunity to improve learning performance.
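To make the role of the kernel concrete, the following is a minimal sketch of one BO step with a GP model and a standard squared-exponential kernel, i.e., the kind of generic choice the text describes as not exploiting prior knowledge. Everything specific here is an assumption made for illustration: the scalar controller gain, the quadratic toy cost, the hyperparameter values, and the lower-confidence-bound rule used to pick the next experiment (a simple stand-in, not the information-gain [3] or safe-exploration [4] criteria cited above).

```python
import numpy as np


def se_kernel(xa, xb, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel: covariance between f(xa[i]) and f(xb[j])."""
    diff = xa[:, None] - xb[None, :]
    return signal_var * np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_posterior(x_train, y_train, x_test, noise_var=1e-4, **kernel_args):
    """Posterior mean and variance of a zero-mean GP at the test inputs."""
    K = se_kernel(x_train, x_train, **kernel_args)
    K = K + noise_var * np.eye(len(x_train))
    K_s = se_kernel(x_train, x_test, **kernel_args)
    K_ss_diag = np.diag(se_kernel(x_test, x_test, **kernel_args))

    # Cholesky factorization for a numerically stable solve of K^{-1} y.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    v = np.linalg.solve(L, K_s)

    mean = K_s.T @ alpha
    var = K_ss_diag - np.sum(v ** 2, axis=0)
    return mean, np.maximum(var, 0.0)  # clip tiny negative round-off


def lower_confidence_bound(mean, var, beta=2.0):
    """Optimistic estimate of the cost; smaller is more attractive."""
    return mean - beta * np.sqrt(var)


# Hypothetical data: three controller gains already tried, with toy costs.
gains = np.array([-2.0, -1.0, 0.5])
costs = (gains + 0.5) ** 2 + 1.0

# Candidate gains for the next experiment.
candidates = np.linspace(-3.0, 3.0, 61)
mean, var = gp_posterior(gains, costs, candidates)
next_gain = candidates[np.argmin(lower_confidence_bound(mean, var))]
print("next gain to evaluate:", next_gain)
```

Swapping `se_kernel` for a kernel that encodes structure of the control problem changes only the covariance computation; the posterior and selection steps stay the same, which is why the kernel is the natural place to inject prior knowledge.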
