Conformal Prediction Beyond the Horizon: Distribution-Free Inference for Policy Evaluation
Neural Information Processing Systems
Reliable uncertainty quantification is crucial for reinforcement learning (RL) in high-stakes settings. We propose a unified conformal prediction framework for infinite-horizon policy evaluation that constructs distribution-free prediction intervals for returns in both on-policy and off-policy settings.
Jun-14-2026, 04:37:45 GMT