Goal-conditioned Imitation Learning

Yiming Ding, Carlos Florensa, Pieter Abbeel, Mariano Phielipp

Neural Information Processing Systems 

Furthermore, we are often interested in reaching a wide range of configurations, so setting up a different reward for each one can be impractical. Methods like Hindsight Experience Replay (HER) have recently shown promise in learning policies that can reach many goals without the need for a reward.
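To make the idea behind HER concrete, the following is a minimal sketch of hindsight goal relabeling, not the authors' implementation. It assumes that goals live in the same space as states, so any state actually visited later in a trajectory can be substituted as the goal. The names `Transition` and `relabel_with_hindsight` and the parameter `k` are hypothetical and chosen for illustration only.

```python
import random
from typing import List, Optional, Tuple

State = Tuple[float, ...]
# Hypothetical transition layout: (state, action, next_state, original_goal)
Transition = Tuple[State, int, State, State]
# Relabeled layout adds the substituted goal and a "goal reached" flag
Relabeled = Tuple[State, int, State, State, bool]


def relabel_with_hindsight(
    trajectory: List[Transition],
    k: int = 4,
    rng: Optional[random.Random] = None,
) -> List[Relabeled]:
    """Relabel each step with goals taken from states reached later on.

    Assumption: the achieved goal of a step is its next_state. For every
    step t, k goals are sampled from the next_states of steps t, t+1, ...
    """
    if rng is None:
        rng = random.Random(0)
    relabeled: List[Relabeled] = []
    for t, (state, action, next_state, _original_goal) in enumerate(trajectory):
        # States actually reached from step t onward are valid hindsight goals.
        future_states = [step[2] for step in trajectory[t:]]
        for _ in range(k):
            new_goal = rng.choice(future_states)
            # The only learning signal is whether this step reached the
            # substituted goal, which is read directly off the trajectory.
            reached = next_state == new_goal
            relabeled.append((state, action, next_state, new_goal, reached))
    return relabeled
```

Under these assumptions, every trajectory yields training data for goals it did reach, even if it never reached the goal it was originally commanded to pursue, so no task-specific reward has to be designed per goal.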
