CoinDICE: Off-Policy Confidence Interval Estimation

Oct-3-2025, 03:57:32 GMT–Neural Information Processing Systems

A major barrier to applying reinforcement learning (RL) is the difficulty of reliably evaluating new policies before deployment, a problem generally known as off-policy evaluation (OPE).

Country:
- North America
  - United States (0.28)
  - Canada (0.28)

Genre:
- Research Report (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Reinforcement Learning (1.00)
  - Representation & Reasoning > Optimization (0.94)