Recovering from Out-of-sample States via Inverse Dynamics in Offline Reinforcement Learning

Neural Information Processing Systems 

However, such pessimism for out-of-sample data could be too restricted and sample inefficient, as not all out-of-sample(unseen) states are not generalizable [20].

Similar Docs  Excel Report  more

TitleSimilaritySource
None found