Reviews: Near-Optimal Reinforcement Learning in Dynamic Treatment Regimes

Jan-25-2025, 04:37:25 GMT – Neural Information Processing Systems

In this paper, the authors provide a method for incorporating observational data (possibly subject to unobserved confounding) to improve the performance of policy learning in online settings (the key theorems are 5, 7, and 8). After a period of discussion, the reviewers reached a consensus that the paper merits publication at NeurIPS and will contribute to the RL literature by giving a principled method of incorporating observational data, even when that data is confounded.

dynamic treatment regime, near-optimal reinforcement learning, observational data
