Why So Pessimistic? Estimating Uncertainties for Offline RL through Ensembles, and Why Their Independence Matters
Neural Information Processing Systems
Through theoretical analyses and construction of examples in toy MDPs, we demonstrate that shared pessimistic targets can paradoxically lead to value estimates that are effectively optimistic.
Feb-9-2026, 20:00:00 GMT