Logarithmic Regret for Adversarial Online Control

Feb-29-2020–arXiv.org Machine Learning

Reinforcement learning and control consider the behavior of an agent making decisions in a dynamic environment in order to suffer minimal loss. In light of recent practical breakthroughs in data-driven approaches to continuous RL and control (Lillicrap et al., 2016; Mnih et al., 2015; Silver et al., 2017), there is great interest in applying these techniques in real-world decision-making applications. However, to reliably deploy data-driven RL and control in physical systems such as self-driving cars, it is critical to develop principled algorithms with provable safety and robustness guarantees. At the same time, algorithms should not be overly pessimistic, and should be able to take advantage of benign environments whenever possible. In this paper we develop algorithms for online linear-quadratic control which ensure robust worst-case performance while optimally adapting to the environment at hand. Linear control has traditionally been studied in settings where the dynamics of the environment are either governed by a well-behaved stochastic process or driven by a worst-case process to which the learner must remain robust in the H∞ sense. We consider an intermediate approach introduced by Agarwal et al. (2019a) in which disturbances are non-stochastic but performance is evaluated in terms of regret. This benchmark forces the learner's control policy to achieve near-optimal performance on any specific disturbance process encountered.

Country:
- North America > United States
  - Massachusetts > Middlesex County > Cambridge (0.14)
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.04)

Genre:
- Research Report (0.64)

Industry:
- Transportation (0.54)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Robots > Autonomous Vehicles (0.54)
  - Machine Learning > Reinforcement Learning (0.34)