cf708fc1decf0337aded484f8f4519ae-Supplemental.pdf

Neural Information Processing Systems 

We found that training an inverse model is crucial for learning good representations. On the first row,alevel from each environment that one-shot PPGS fails tosolve(thewhitearrowsrepresent thepolicy). Iterative Model Improvement In general settings, collecting training trajectories by sampling actions uniformly atrandom does not grant sufficient coverage ofthe state space. GLAMORGLAMOR [34] learns inverse dynamics to achieve visual goals in Atari games. The only difference withPPGS in terms of settings is that we allowGLAMORto collect data on-policy and for more interactions (2M).

Similar Docs  Excel Report  more

TitleSimilaritySource
None found