Reviews: Regret Bounds for Learning State Representations in Reinforcement Learning

Jan-25-2025, 23:58:45 GMT–Neural Information Processing Systems

This paper proposes a natural extension of UCRL2 to learning state representations. The proposed algorithm chooses optimistically over a finite set of candidate MDPs and their corresponding policies. The analysis yields regret bounds that improve on existing ones. The paper was discussed, and all reviewers agree that this is a natural extension of UCRL2 that deserves to be published.

learning state representation, regret bound, reinforcement learning
