Confounding-Robust Policy Evaluation in Infinite-Horizon Reinforcement Learning
–Neural Information Processing Systems
Off-policy evaluation of sequential decision policies from observational data is necessary in applications of batch reinforcement learning such as education and healthcare.
Neural Information Processing Systems
Nov-15-2025, 17:51:27 GMT
- Country:
- North America
- Canada (0.04)
- United States > New Jersey
- Mercer County > Princeton (0.04)
- North America
- Genre:
- Research Report (0.68)
- Industry:
- Health & Medicine (0.66)
- Technology: