ES-MAML: Simple Hessian-Free Meta Learning

Xingyou Song, Wenbo Gao, Yuxiang Yang, Krzysztof Choromanski, Aldo Pacchiano, Yunhao Tang

arXiv.org Artificial Intelligence 

Meta-learning is a paradigm in machine learning which aims to develop models and training algorithms that can quickly adapt to new tasks and data. Our focus in this paper is on meta-learning in reinforcement learning (RL), where data efficiency is of paramount importance because gathering new samples often requires costly simulations or interactions with the real world. A popular technique for RL meta-learning is Model-Agnostic Meta-Learning (MAML) (Finn et al., 2017, 2018), an algorithm for training an agent (the meta-policy) which can quickly adapt to new and unknown tasks by performing one (or a few) gradient updates in the new environment. We provide a formal description of MAML in Section 2. MAML has proven to be successful for many applications. However, implementing and running MAML continues to be challenging. One major complication is that the standard version of MAML requires estimating second derivatives of the RL reward function, which is difficult when using backpropagation on stochastic policies; indeed, the original implementation of MAML (Finn et al., 2017) did so incorrectly, which spurred the development of unbiased higher-order estimators (DiCE, (Foerster et al., 2018)) and further analysis of the credit assignment mechanism in MAML (Rothfuss et al., 2019).
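To make the second-derivative issue concrete, here is a sketch of the standard one-step MAML objective (the paper's formal description is in Section 2); the notation used below, with per-task expected reward $f_T$ and adaptation step size $\alpha$, is assumed for illustration:

$$J(\theta) \;=\; \mathbb{E}_{T \sim \mathcal{P}(\mathcal{T})}\Big[\, f_T\big(\theta + \alpha \nabla_\theta f_T(\theta)\big) \,\Big],$$

whose gradient, by the chain rule applied to the inner adaptation step, contains the Hessian of the task reward:

$$\nabla_\theta J(\theta) \;=\; \mathbb{E}_{T \sim \mathcal{P}(\mathcal{T})}\Big[\, \big(I + \alpha \nabla^2_\theta f_T(\theta)\big)\, \nabla_\theta f_T\big(\theta + \alpha \nabla_\theta f_T(\theta)\big) \,\Big].$$

It is this Hessian term $\nabla^2_\theta f_T(\theta)$ that is difficult to estimate via backpropagation on stochastic policies, and that the Hessian-free approach of the title seeks to avoid.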
