Logarithmic Regret Bound in Partially Observable Linear Dynamical Systems

Jan-14-2025, 22:23:40 GMT–Neural Information Processing Systems

We study the problem of system identification and adaptive control in partially observable linear dynamical systems. Adaptive and closed-loop system identification is challenging because the data collection itself introduces correlations. In this paper, we present the first model estimation method with finite-time guarantees in both open-loop and closed-loop system identification. Deploying this estimation method, we propose adaptive control online learning (AdapOn), an efficient reinforcement learning algorithm that adaptively learns the system dynamics and continuously updates its controller through online learning steps. AdapOn estimates the model dynamics by occasionally solving a linear regression problem through interactions with the environment.
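To make the regression step in the abstract concrete, here is a minimal sketch of least-squares system identification. It is not AdapOn and not the paper's estimator: it is plain open-loop least squares that regresses each output on the last H inputs to recover the Markov parameters C B, C A B, ..., C A^(H-1) B. The two-state system, the noise level, the truncation length H, and the function names `simulate` and `estimate_markov_parameters` are all assumptions made for illustration.

```python
import numpy as np

def simulate(A, B, C, T, noise_std, rng):
    """Roll out x_{t+1} = A x_t + B u_t + w_t, y_t = C x_t + z_t with i.i.d. Gaussian inputs."""
    n, m = B.shape
    p = C.shape[0]
    x = np.zeros(n)
    U = rng.standard_normal((T, m))
    Y = np.zeros((T, p))
    for t in range(T):
        Y[t] = C @ x + noise_std * rng.standard_normal(p)
        x = A @ x + B @ U[t] + noise_std * rng.standard_normal(n)
    return U, Y

def estimate_markov_parameters(U, Y, H):
    """Least-squares fit of y_t on [u_{t-1}, ..., u_{t-H}]; returns a (p, H*m) matrix."""
    T, _ = U.shape
    # Column block k holds u_{t-k} for t = H, ..., T-1.
    Phi = np.hstack([U[H - k : T - k] for k in range(1, H + 1)])
    coeffs, *_ = np.linalg.lstsq(Phi, Y[H:], rcond=None)
    return coeffs.T

# Hypothetical stable system, chosen only for this illustration.
A = np.array([[0.5, 0.1], [0.0, 0.3]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, 0.0]])
H = 5

rng = np.random.default_rng(0)
U, Y = simulate(A, B, C, T=5000, noise_std=0.1, rng=rng)
G_hat = estimate_markov_parameters(U, Y, H)
G_true = np.hstack([C @ np.linalg.matrix_power(A, k) @ B for k in range(H)])
print("max abs error:", np.max(np.abs(G_hat - G_true)))
```

This works here because the inputs are independent of the noise. Under closed-loop control the inputs depend on past outputs, and therefore on past noise, which is the correlation problem the abstract refers to; this sketch does not address that case.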

logarithmic regret bound, observable linear dynamical system, system identification




Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Reinforcement Learning (0.62)
  - Statistical Learning (0.61)