Certainty Equivalent Control of LQR is Efficient

Mania, Horia, Tu, Stephen, Recht, Benjamin

arXiv.org Machine Learning 

One of the most straightforward methods for controlling a dynamical system with unknown transitions is based on the certainty equivalence principle: a model of the system is fit by observing its time evolution, and a control policy is then designed by treating the fitted model as the truth [8]. Despite the simplicity of this method, it is challenging to guarantee its efficiency because small modeling errors may propagate to large, undesirable behaviors on long time horizons. As a result, most work on controlling systems with unknown dynamics has explicitly incorporated robustness against model uncertainty [11, 12, 20, 25, 35, 36]. In this work, we show that for the standard baseline of controlling an unknown linear dynamical system with a quadratic objective function, known as the Linear Quadratic Regulator (LQR), certainty equivalent control synthesis achieves better cost than prior methods that account for model uncertainty. In the case of offline control, where one collects some data and then designs a fixed control policy to be run on an infinite time horizon, we show that the gap between the performance of the certainty equivalent controller and the optimal control policy scales quadratically with the error in the model parameters.
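The certainty equivalence recipe described above can be illustrated with a minimal sketch. It is not the authors' code: it assumes a scalar system x[t+1] = a*x[t] + b*u[t] + w[t] with stage cost q*x^2 + r*u^2, and every function name and parameter value below is hypothetical. The model is fit by least squares, the gain is designed as if the fitted model were exact, and the resulting steady-state cost is compared with the optimal one.

```python
import random


def simulate(a, b, steps, rng, noise_std=1.0, input_std=1.0):
    """Roll out x[t+1] = a*x[t] + b*u[t] + w[t] with random Gaussian inputs."""
    xs, us = [0.0], []
    for _ in range(steps):
        u = rng.gauss(0.0, input_std)
        w = rng.gauss(0.0, noise_std)
        xs.append(a * xs[-1] + b * u + w)
        us.append(u)
    return xs, us


def fit_scalar_model(xs, us):
    """Least-squares estimate of (a, b) via the 2x2 normal equations."""
    sxx = sxu = suu = sxy = suy = 0.0
    for x, u, x_next in zip(xs[:-1], us, xs[1:]):
        sxx += x * x
        sxu += x * u
        suu += u * u
        sxy += x * x_next
        suy += u * x_next
    det = sxx * suu - sxu * sxu
    a_hat = (sxy * suu - sxu * suy) / det
    b_hat = (sxx * suy - sxu * sxy) / det
    return a_hat, b_hat


def riccati_scalar(a, b, q, r, iters=10_000, tol=1e-14):
    """Fixed-point iteration on the scalar discrete-time Riccati equation."""
    p = q
    for _ in range(iters):
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    return p


def lqr_gain(a, b, q, r):
    """Gain k of the policy u = -k*x that is optimal for the model (a, b)."""
    p = riccati_scalar(a, b, q, r)
    return a * b * p / (r + b * b * p)


def average_cost(a, b, k, q, r, noise_var=1.0):
    """Steady-state cost per step of u = -k*x on the TRUE system (a, b)."""
    rho = a - b * k  # closed-loop pole
    if abs(rho) >= 1.0:
        return float("inf")  # the gain does not stabilize the true system
    return noise_var * (q + r * k * k) / (1.0 - rho * rho)


def cost_gap(a, b, a_hat, b_hat, q, r):
    """Cost of the certainty equivalent gain minus the optimal cost."""
    k_ce = lqr_gain(a_hat, b_hat, q, r)   # designed on the fitted model
    k_opt = lqr_gain(a, b, q, r)          # designed on the true model
    return average_cost(a, b, k_ce, q, r) - average_cost(a, b, k_opt, q, r)
```

As a usage example under these assumptions, one could call `simulate(0.9, 1.0, 2000, random.Random(0))`, pass the trajectory to `fit_scalar_model`, and feed the estimates to `cost_gap`. Evaluating `cost_gap` with the estimate of `a` off by 0.02 and then by 0.01 should shrink the gap by roughly a factor of four, which is the quadratic scaling the abstract refers to, here only in this toy scalar setting.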
