Review for NeurIPS paper: Value-driven Hindsight Modelling

Neural Information Processing Systems 

Learning value functions is a central problem in reinforcement learning, made difficult in part by the non-stationarity introduced by bootstrapping. This paper proposes a fresh approach to improving value-function learning by conditioning the value function, at training time, on information about future states (hindsight). Conditioning on the right future data should reduce uncertainty about the return. All the reviewers liked the paper's premise, clear motivation, and thorough experiments.
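To make the premise concrete, here is a minimal linear-regression sketch of the hindsight idea, not the paper's actual architecture: a hindsight value is fit with access to a future summary `phi`, a separate model learns to predict `phi` from the state, and the two are composed at evaluation time. All names and the toy data-generating process are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting: the return G depends on the state s and on phi, a
# low-dimensional summary of the future trajectory that is observable
# only in hindsight (i.e. after the trajectory has been collected).
n, d = 2000, 5
w = rng.normal(size=d)
u = 1.5
S = rng.normal(size=(n, d))
# phi is partly predictable from s, plus future randomness.
phi = S @ w * 0.5 + rng.normal(scale=0.1, size=n)
G = S @ w + u * phi + rng.normal(scale=0.05, size=n)  # realised returns

def lstsq_fit(X, y):
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# 1) Hindsight value v+(s, phi): regress returns on (s, phi).
#    The hindsight feature absorbs future randomness, giving
#    lower-variance regression targets.
Xh = np.column_stack([S, phi])
theta_h = lstsq_fit(Xh, G)

# 2) Hindsight model: predict the future summary phi from the state.
beta = lstsq_fit(S, phi)

# 3) At evaluation time phi is unavailable, so plug in the model's
#    prediction: v(s) = v+(s, phi_hat(s)).
def value(S_new):
    phi_hat = S_new @ beta
    return np.column_stack([S_new, phi_hat]) @ theta_h

# Baseline for comparison: direct regression of G on s alone.
theta_d = lstsq_fit(S, G)
```

In this toy setup the hindsight regression fits the observed returns much more tightly than the direct one, because `phi` explains variance that the state alone cannot; the interesting question the paper studies is when composing `v+` with a learned `phi_hat` also helps the final value estimate.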