Adaptive Regret for Control of Time-Varying Dynamics

Paula Gradu, Elad Hazan, Edgar Minasyan

Jul-16-2020–arXiv.org Machine Learning

We consider regret minimization for online control with time-varying linear dynamical systems. The metric of performance we study is adaptive policy regret, or regret compared to the best policy on any interval in time. We give an efficient algorithm that attains first-order adaptive regret guarantees for the setting of online convex optimization with memory. We also show that these first-order bounds are nearly tight. This algorithm is then used to derive a controller with adaptive regret guarantees that provably competes with the best linear dynamical controller on any interval in time. We validate these theoretical findings experimentally on (1) simulations of time-varying linear dynamics and disturbances, and (2) the non-linear inverted pendulum benchmark.
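To illustrate what "regret on any interval in time" measures, here is a minimal sketch that is not taken from the paper. It assumes a simplified setting: memoryless per-step losses and a small finite set of fixed comparators, whereas the paper treats policy regret with memory against a class of linear controllers. The function name `adaptive_regret` and its arguments are hypothetical.

```python
import numpy as np

def adaptive_regret(learner_losses, comparator_losses):
    """Worst-case regret over all contiguous intervals (simplified, assumed setting).

    learner_losses: shape (T,), loss paid by the learner at each step.
    comparator_losses: shape (K, T), loss each of K fixed comparators
        would have paid at each step.
    """
    learner = np.asarray(learner_losses, dtype=float)
    comps = np.asarray(comparator_losses, dtype=float)
    T = learner.shape[0]
    # Prefix sums with a leading zero, so the sum over steps r..s-1 is P[s] - P[r].
    lp = np.concatenate([[0.0], np.cumsum(learner)])
    cp = np.concatenate(
        [np.zeros((comps.shape[0], 1)), np.cumsum(comps, axis=1)], axis=1
    )
    worst = -np.inf
    for r in range(T):
        for s in range(r + 1, T + 1):
            # Best fixed comparator in hindsight on this interval only.
            best = np.min(cp[:, s] - cp[:, r])
            worst = max(worst, (lp[s] - lp[r]) - best)
    return worst

# The learner matches comparator B, which is optimal over the full horizon
# but poor on the first half, where comparator A is better.
learner = [1, 1, 0, 0]
comparators = [[0, 0, 1, 1],   # A
               [1, 1, 0, 0]]   # B
full_horizon_regret = sum(learner) - min(sum(c) for c in comparators)
interval_regret = adaptive_regret(learner, comparators)
```

In this toy example the regret over the full horizon is 0 while the adaptive regret is 2, which is why an interval-based metric is more demanding when the best comparator changes over time.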

algorithm, artificial intelligence, machine learning

Country:
- Europe (0.28)
- North America > United States
  - California (0.14)

Genre:
- Research Report (0.64)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
