Review for NeurIPS paper: DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction

The paper is strongly theoretically grounded, with clear explanations of the intuition and proofs justifying the approximations used. The significance of the contribution is large: most modern RL algorithms belong to the approximate dynamic programming (ADP) family that the paper proposes to modify, and the corrective-feedback reweighting can be slotted into most training loops without compatibility issues. As the authors note, it could also be used to guide exploration rather than only for post hoc correction of the transition distribution. This is clearly relevant to the NeurIPS community, much of which makes use of this family of RL algorithms.
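To illustrate the "slots into most training loops" claim, here is a minimal, hedged sketch of how a DisCor-style weighting could be added to a generic tabular ADP update. The error estimator `Delta`, temperature `tau`, and the weight `exp(-gamma * Delta / tau)` follow the paper's notation, but the toy environment, learning rate, and the simplified `Delta` recursion here are illustrative assumptions, not the authors' exact algorithm.

```python
import numpy as np

def discor_update(Q, Delta, batch, gamma=0.99, tau=10.0, lr=0.5):
    """One weighted Bellman backup over a batch of (s, a, r, s', done) tuples.

    Transitions whose bootstrap targets are estimated to carry large
    accumulated error (Delta at the next state) are down-weighted,
    approximating the corrective-feedback reweighting described in the paper.
    """
    weights = []
    for s, a, r, s2, done in batch:
        next_max = 0.0 if done else Q[s2].max()
        target = r + gamma * next_max
        td_error = target - Q[s, a]
        # Estimated error of the bootstrap target (simplified Delta recursion).
        next_delta = 0.0 if done else Delta[s2].max()
        w = np.exp(-gamma * next_delta / tau)  # down-weight unreliable targets
        Q[s, a] += lr * w * td_error
        # Track accumulated Bellman error: |td_error| + gamma * Delta(s').
        Delta[s, a] += lr * (abs(td_error) + gamma * next_delta - Delta[s, a])
        weights.append(w)
    return weights

# Usage on a toy 3-state chain MDP with 2 actions:
Q = np.zeros((3, 2))
Delta = np.zeros((3, 2))
batch = [(0, 1, 0.0, 1, False), (1, 1, 0.0, 2, False), (2, 1, 1.0, 2, True)]
for _ in range(50):
    ws = discor_update(Q, Delta, batch)
```

The key point is that only the per-transition weight changes; the surrounding replay-and-backup loop is untouched, which is why the method composes with most ADP-based agents.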