Double Check Your State Before Trusting It: Confidence-Aware Bidirectional Offline Model-Based Imagination

Neural Information Processing Systems 

Offline RL is deemed promising [16, 14] because online learning requires the agent to interact continuously with the environment, which may be costly, time-consuming, or even dangerous.