Reviews: Regret Bounds for Learning State Representations in Reinforcement Learning

Jan-25-2025, 23:58:55 GMT–Neural Information Processing Systems

The authors present a regret analysis for learning state representations. They propose an algorithm, UCB-MS, with O(\sqrt{T}) regret, which improves on the best previously known result. The paper is well organized and easy to follow. The authors also discuss possible methods and directions for further improving the bound. The paper would be clearer if Lemma 3 were proved in the appendix.

learning state representation, regret bound, reinforcement learning, (9 more...)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)