Optimal Dynamic Regret in LQR Control

Jan-18-2025, 05:40:10 GMT–Neural Information Processing Systems

We consider the problem of nonstochastic control with a sequence of quadratic losses, i.e., LQR control. The dynamic regret rate we obtain improves on the best known rate of $\tilde{O}(\sqrt{n (\mathcal{TV}(M_{1:n}) + 1)})$ for general convex losses and is information-theoretically optimal for LQR. The main technical components are the reduction of LQR to online linear regression with delayed feedback due to Foster & Simchowitz (2020), and a new \emph{proper} learning algorithm with an optimal $\tilde{O}(n^{1/3})$ dynamic regret on a family of ``minibatched'' quadratic losses, which could be of independent interest.
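As a rough illustration of the quantity \mathcal{TV}(M_{1:n}) that appears in the prior bound, the sketch below sums the distances between consecutive policy parameters and evaluates the scale \sqrt{n (\mathcal{TV} + 1)} with constants and logarithmic factors ignored. The abstract does not say which norm defines the total variation, so the Frobenius norm used here is an assumption, and the function names are hypothetical.

```python
import math


def frobenius_distance(a, b):
    """Frobenius norm of the difference of two equally sized matrices (lists of rows)."""
    return math.sqrt(
        sum((x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    )


def total_variation(policies):
    """Sum of distances between consecutive parameters M_1, ..., M_n.

    Assumption: the norm is Frobenius; the source does not specify it.
    """
    return sum(frobenius_distance(p, q) for p, q in zip(policies, policies[1:]))


def prior_rate_scale(n, tv):
    """Scale of the prior bound sqrt(n * (TV + 1)), without constants or log factors."""
    return math.sqrt(n * (tv + 1.0))


if __name__ == "__main__":
    # A toy comparator sequence that changes once and then stays fixed.
    sequence = [
        [[0.0, 0.0], [0.0, 0.0]],
        [[1.0, 1.0], [1.0, 1.0]],
        [[1.0, 1.0], [1.0, 1.0]],
    ]
    tv = total_variation(sequence)
    print("total variation:", tv)
    print("prior bound scale for n=100:", prior_rate_scale(100, tv))
```

A comparator sequence that never changes has zero total variation, in which case the prior bound reduces to the familiar \sqrt{n} scale.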

artificial intelligence, machine learning, optimal dynamic regret
