A Unified Principle of Pessimism for Offline Reinforcement Learning under Model Mismatch
Neural Information Processing Systems
To tackle these issues, we propose a unified principle of pessimism using distributionally robust Markov decision processes.
Oct-9-2025, 18:50:36 GMT
- Country:
- Europe > United Kingdom
- England > Cambridgeshire > Cambridge (0.04)
- North America > United States
- Arizona > Maricopa County
- Tempe (0.04)
- California > San Mateo County
- Menlo Park (0.04)
- Florida > Orange County
- Orlando (0.14)
- Massachusetts > Middlesex County
- Cambridge (0.04)
- New York > Erie County
- Buffalo (0.04)
- Genre:
- Research Report > Experimental Study (0.92)
- Industry:
- Government (0.30)
- Technology: