Bellman-consistent Pessimism for Offline Reinforcement Learning
–Neural Information Processing Systems
The use of pessimism, when reasoning about datasets lacking exhaustive exploration, has recently gained prominence in offline reinforcement learning.
Aug-14-2025, 03:38:03 GMT