Why So Pessimistic? Estimating Uncertainties for Offline RL through Ensembles, and Why Their Independence Matters

Neural Information Processing Systems 

Through theoretical analyses and construction of examples in toy MDPs, we demonstrate that shared pessimistic targets can paradoxically lead to value estimates that are effectively optimistic.
