Optimal Dynamic Regret in LQR Control
Neural Information Processing Systems
We consider the problem of nonstochastic control with a sequence of quadratic losses, i.e., LQR control. The dynamic regret rate we obtain improves on the best known rate of $\tilde{O}(\sqrt{n (\mathcal{TV}(M_{1:n}) + 1)})$ for general convex losses and is information-theoretically optimal for LQR. The main technical components include the reduction of LQR to online linear regression with delayed feedback due to Foster & Simchowitz (2020), as well as a new \emph{proper} learning algorithm with an optimal $\tilde{O}(n^{1/3})$ dynamic regret on a family of "minibatched" quadratic losses, which could be of independent interest.
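To make the quantities in the abstract concrete, here is a minimal sketch that computes a total variation for a sequence of policy parameters $M_1, \dots, M_n$ and evaluates the earlier general-convex-loss bound $\sqrt{n(\mathcal{TV}(M_{1:n}) + 1)}$ on it. The Frobenius norm, the function names `total_variation` and `prior_rate`, and the toy sequence are assumptions for illustration only; the paper's exact definition of $\mathcal{TV}(M_{1:n})$ may differ, and constants and logarithmic factors are ignored.

```python
import numpy as np


def total_variation(Ms):
    """Sum of norms of consecutive differences, sum_t ||M_t - M_{t-1}||.

    Assumption: the Frobenius norm is used; the paper may use another norm.
    """
    Ms = np.asarray(Ms, dtype=float)
    diffs = Ms[1:] - Ms[:-1]
    # np.linalg.norm on a 2-D array defaults to the Frobenius norm.
    return float(sum(np.linalg.norm(d) for d in diffs))


def prior_rate(n, tv):
    """The earlier bound sqrt(n * (TV + 1)), without constants or log factors."""
    return float(np.sqrt(n * (tv + 1.0)))


# Toy comparator sequence (hypothetical): 2x2 parameters with a single switch.
n = 1000
Ms = [np.zeros((2, 2))] * (n // 2) + [np.ones((2, 2))] * (n - n // 2)

tv = total_variation(Ms)
print("total variation:", tv)
print("prior bound scale:", prior_rate(n, tv))
```

A piecewise-constant comparator like this one has small total variation even though it changes, which is the regime where a bound's dependence on $\mathcal{TV}(M_{1:n})$ matters most.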
Jan-18-2025, 05:40:10 GMT