Review for NeurIPS paper: A Unifying View of Optimism in Episodic Reinforcement Learning

Jan-21-2025, 14:34:51 GMT–Neural Information Processing Systems

Weaknesses: While I like the duality result, I do not find this paper substantial enough to merit acceptance. The paper shows that a class of model-optimistic algorithms can be implemented efficiently (with minor modifications). However, none of the state-of-the-art algorithms are model-optimistic. This is somewhat inherent to this class of algorithms, because the transition model scales with S^2, whereas the optimal bounds, which are achieved by value-optimistic algorithms, scale linearly in S. Value-optimistic algorithms are not only more computationally efficient but also more statistically efficient. So making model-optimistic algorithms more efficient is not a very significant result.

algorithm, episodic reinforcement learning, reinforcement learning, (5 more...)


