Planning with Goal-Conditioned Policies

Soroush Nasiriany, Vitchyr Pong, Steven Lin, Sergey Levine

Mar-19-2020, 02:47:03 GMT–Neural Information Processing Systems

Planning methods can solve temporally extended sequential decision-making problems by composing simple behaviors. However, planning requires suitable abstractions for the states and transitions, which typically need to be designed by hand. In contrast, reinforcement learning (RL) can acquire behaviors directly from low-level inputs, but struggles with temporally extended tasks. Can we use reinforcement learning to automatically form the abstractions needed for planning, thus obtaining the best of both approaches? We show that goal-conditioned policies learned with RL can be incorporated into planning, so that a planner can focus on which states to reach, rather than how those states are reached.
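To make the division of labor concrete, here is a minimal Python sketch of a planner that chooses which states to reach and leaves the question of how to reach them to a goal-conditioned policy. It is an illustration under stated assumptions, not the authors' implementation: `value` is a hand-written stand-in for a learned goal-conditioned value function, `REACH` is an assumed distance the low-level policy can cover reliably, and the cross-entropy method used for the search is simply one convenient optimizer chosen for this example. All names here are hypothetical.

```python
import math
import random
import statistics

REACH = 1.5  # assumption: distance the low-level policy can cover reliably


def value(state, goal):
    """Hypothetical stand-in for a learned goal-conditioned value function.

    Returns 0 when `goal` lies within REACH of `state`, and a penalty that
    grows with the excess distance otherwise.
    """
    return -max(0.0, math.dist(state, goal) - REACH)


def plan_score(start, subgoals, goal):
    """Score a plan by its worst segment (values closer to 0 are better)."""
    waypoints = [start, *subgoals, goal]
    return min(value(a, b) for a, b in zip(waypoints, waypoints[1:]))


def unflatten(flat, dim):
    """Split a flat list of coordinates into `dim`-dimensional subgoals."""
    return [tuple(flat[i:i + dim]) for i in range(0, len(flat), dim)]


def plan_subgoals(start, goal, num_subgoals=3, iters=30, pop=200, elite=20, seed=0):
    """Search over subgoal sequences with the cross-entropy method."""
    rng = random.Random(seed)
    dim = len(start)
    # Uninformed initial guess: every subgoal sits at the midpoint.
    midpoint = [(s + g) / 2 for s, g in zip(start, goal)]
    mean = midpoint * num_subgoals
    std = [math.dist(start, goal) / 4] * len(mean)
    best = unflatten(mean, dim)
    best_score = plan_score(start, best, goal)
    for _ in range(iters):
        samples = [[rng.gauss(m, s) for m, s in zip(mean, std)] for _ in range(pop)]
        samples.sort(key=lambda f: plan_score(start, unflatten(f, dim), goal), reverse=True)
        elites = samples[:elite]
        top_score = plan_score(start, unflatten(elites[0], dim), goal)
        if top_score > best_score:
            best, best_score = unflatten(elites[0], dim), top_score
        # Refit the sampling distribution to the elite plans.
        columns = list(zip(*elites))
        mean = [statistics.fmean(c) for c in columns]
        std = [statistics.pstdev(c) + 1e-3 for c in columns]
    return best, best_score


start, goal = (0.0, 0.0), (4.0, 0.0)
subgoals, score = plan_subgoals(start, goal)
print(subgoals, score)
```

In this toy setting the goal is farther away than `REACH`, so a direct attempt scores poorly, while a chain of intermediate subgoals can keep every segment within the range the low-level policy is assumed to handle. The planner never reasons about individual actions; it only proposes states.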

abstraction, goal-conditioned policy, latent variable model, (3 more...)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.88)