We propose the first finite-time system identification algorithm for partially observable linear dynamical systems (LDS).

Neural Information Processing Systems 

We thank the reviewers for their effort and insightful comments during these unprecedented times. LQR and LQG are among the few continuous settings in which optimal policies exist (and mostly have closed form) [1]. We therefore do not see why this paper would be less relevant to our community. If PE is absent, we provide two general algorithms stated in Cor. The agent uses a warm-up period of O ( T) after which it commits to a controller yielding a regret of T .
