Off-Policy Evaluation for Human Feedback
Qitong Gao, Ge Gao
Neural Information Processing Systems
Off-policy evaluation (OPE) helps close the gap between offline training and evaluation in reinforcement learning (RL): it estimates the performance and/or rank of target (evaluation) policies using only offline trajectories.
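To make the idea of estimating a target policy's performance from offline trajectories concrete, here is a minimal sketch of ordinary trajectory-wise importance sampling, a textbook OPE estimator. It is not the method proposed in this paper. The function name `importance_sampling_ope`, the `(state, action, reward)` trajectory layout, and the toy one-state problem are all assumptions made for illustration.

```python
def importance_sampling_ope(trajectories, target_prob, behavior_prob, gamma=0.99):
    """Estimate the target policy's expected return from offline trajectories.

    trajectories: list of trajectories, each a list of (state, action, reward)
        tuples collected under the behavior policy (hypothetical layout).
    target_prob, behavior_prob: callables (state, action) -> action probability.
    """
    estimates = []
    for traj in trajectories:
        weight = 1.0  # cumulative likelihood ratio pi_target / pi_behavior
        ret = 0.0     # discounted return observed in the logged trajectory
        for t, (state, action, reward) in enumerate(traj):
            weight *= target_prob(state, action) / behavior_prob(state, action)
            ret += (gamma ** t) * reward
        estimates.append(weight * ret)
    return sum(estimates) / len(estimates)


# Toy data: one state, two actions, logged by a uniform behavior policy.
logged = [[(0, 0, 1.0)], [(0, 1, 0.0)]]
behavior = lambda s, a: 0.5

# Two hypothetical target policies to evaluate and rank.
candidates = {
    "always_action_0": lambda s, a: 1.0 if a == 0 else 0.0,
    "always_action_1": lambda s, a: 1.0 if a == 1 else 0.0,
}
scores = {
    name: importance_sampling_ope(logged, pi, behavior, gamma=1.0)
    for name, pi in candidates.items()
}
ranking = sorted(scores, key=scores.get, reverse=True)
```

Sorting the estimated returns gives the policy ranking mentioned above. The estimator assumes the behavior policy assigns nonzero probability to every logged action, and its variance grows quickly with trajectory length, which is one reason more elaborate OPE methods exist.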
Oct-8-2025, 05:56:36 GMT
- Country:
- Asia > Middle East
- Jordan (0.04)
- Europe > France (0.04)
- North America > United States
- North Carolina
- Durham County > Durham (0.04)
- Wake County > Raleigh (0.04)
- Genre:
- Research Report
- Experimental Study (0.46)
- New Finding (0.67)
- Industry:
- Education > Educational Technology
- Health & Medicine
- Pharmaceuticals & Biotechnology (0.93)
- Surgery (0.69)
- Therapeutic Area
- Endocrinology > Diabetes (0.46)
- Neurology > Parkinson's Disease (0.46)
- Oncology (0.67)
- Technology:
- Information Technology > Artificial Intelligence
- Machine Learning
- Neural Networks > Deep Learning (0.93)
- Reinforcement Learning (0.89)
- Natural Language (1.00)
- Representation & Reasoning (1.00)
- Robots (1.00)