Leveraging Jumpy Models for Planning and Fast Learning in Robotic Domains

Jingwei Zhang, Jost Tobias Springenberg, Arunkumar Byravan, Leonard Hasenclever, Abbas Abdolmaleki, Dushyant Rao, Nicolas Heess, Martin Riedmiller

arXiv.org Artificial Intelligence 

From daily interactions with the world, humans gradually develop an internal understanding of which series of events will be triggered when a certain sequence of actions is taken (Hogendoorn and Burkitt, 2018; Maus et al., 2013; Nortmann et al., 2015). This mental model of the world can serve as a compact proxy for our previous experiences and help us plan routes to desired goals before taking action (Ha and Schmidhuber, 2018). Studies further suggest that these mental predictive models may not be restricted to the level of primitive actions (Botvinick, 2008; Consul et al., 2022), but may instead make predictions over longer timescales that abstract away the detailed consequences of behavior, which can enable efficient long-horizon planning to guide our daily decision making. When developing intelligent artificial agents, it is therefore natural to imagine a similar process being useful for learning and transferring abstract models of the world across streams of experience and tasks. We expect such a temporally abstract model of actions and dynamics to be significantly more useful than a simple one-step prediction model (together with primitive policies) when transferred to a target task. This is because such a model should allow us to plan rapidly over long trajectories (to find states with high reward) while alleviating the common problem of error accumulation that occurs when chaining one-step prediction models, a problem that limits the effective planning horizon in most existing methods, e.g.
