Learning to Predict Without Looking Ahead: World Models Without Forward Prediction

Neural Information Processing Systems 

In this work, we introduce a modification to traditional reinforcement learning which we call observational dropout, whereby we limit the agents ability to observe the real environment at each timestep.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found