Deterministic Uncertainty Propagation for Improved Model-Based Offline Reinforcement Learning
Neural Information Processing Systems
We present a theoretical result demonstrating the strong dependency of suboptimality on the number of Monte Carlo samples taken per Bellman target calculation.
Oct-10-2025, 07:44:20 GMT
- Country:
  - Asia > Middle East
    - Jordan (0.04)
  - Europe > Denmark
    - Southern Denmark (0.04)
- Genre:
  - Research Report
    - Experimental Study (1.00)
    - New Finding (1.00)