A Unified Principle of Pessimism for Offline Reinforcement Learning under Model Mismatch
Neural Information Processing Systems
To tackle these issues, we propose a unified principle of pessimism using distributionally robust Markov decision processes.
Mar-18-2025, 10:30:10 GMT
- Country:
- North America > United States
- Arizona (0.14)
- Florida > Orange County
- Orlando (0.14)
- Genre:
- Research Report > Experimental Study (0.92)
- Industry:
- Government (0.30)
- Technology: