Reinforcement Learning
Goal-conditioned Imitation Learning
Yiming Ding, Carlos Florensa, Pieter Abbeel, Mariano Phielipp
Furthermore, we are often interested in reaching a wide range of configurations, so setting up a different reward for each one can be impractical. Methods like Hindsight Experience Replay (HER) have recently shown promise in learning policies that can reach many goals without the need for a reward.
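To make the idea behind HER concrete, the following is a minimal sketch of hindsight relabeling, not the authors' implementation. It assumes a trajectory stored as a list of (state, action, next_state) tuples and uses the "future" strategy, in which each transition is paired with a goal that was actually reached later in the same trajectory. The function name `hindsight_relabel`, the parameter `k`, and the tuple layout are hypothetical choices made for illustration.

```python
import random
from typing import List, Optional, Tuple

State = Tuple[float, ...]
Action = Tuple[float, ...]
Step = Tuple[State, Action, State]                  # (state, action, next_state)
Relabeled = Tuple[State, Action, State, State, bool]  # (..., goal, reached)


def hindsight_relabel(
    trajectory: List[Step],
    k: int = 4,
    rng: Optional[random.Random] = None,
) -> List[Relabeled]:
    """Relabel each transition with k goals taken from later in the trajectory.

    Hypothetical helper: the goal is a state the agent actually reached at
    or after the current step, so no hand-designed reward is required. The
    `reached` flag is a sparse success indicator (next_state equals goal).
    """
    rng = rng or random.Random(0)  # fixed seed only to make the sketch repeatable
    horizon = len(trajectory)
    relabeled: List[Relabeled] = []
    for t, (state, action, next_state) in enumerate(trajectory):
        for _ in range(k):
            # Pick a step index in [t, horizon - 1] and use the state reached there.
            future = rng.randint(t, horizon - 1)
            goal = trajectory[future][2]
            relabeled.append((state, action, next_state, goal, next_state == goal))
    return relabeled


if __name__ == "__main__":
    # Toy 1-D trajectory: the agent moves right by one unit per step.
    toy = [((0.0,), (1.0,), (1.0,)), ((1.0,), (1.0,), (2.0,)), ((2.0,), (1.0,), (3.0,))]
    for item in hindsight_relabel(toy, k=2):
        print(item)
```

Because every relabeled goal was reached by construction, even a trajectory that failed at its original goal yields useful goal-conditioned training data; a policy conditioned on (state, goal) can then be trained on these tuples.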