Reviews: The Importance of Sampling in Meta-Reinforcement Learning

Neural Information Processing Systems 

The paper shows the importance of the training setup for MAML and RL². A setup can include "exploratory episodes" and measure the loss only on the subsequent "reporting" episodes. The paper presents interesting results. The introduced E-MAML and E-RL² variants clearly help. The main problem with the paper: it does not define the objective well. I could only deduce from Appendix C that the setup is: after starting in a new environment, do 3 exploratory episodes and report the collected reward on the next 2 episodes.
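To make my reading of the setup concrete, here is a minimal sketch of the protocol as I understood it from Appendix C. The function name `reported_return`, the `run_episode` callback, and the choice to sum (rather than average) the reporting rewards are my assumptions for illustration, not taken from the paper.

```python
from typing import Callable, List


def reported_return(run_episode: Callable[[int], float],
                    n_explore: int = 3,
                    n_report: int = 2) -> float:
    """Run n_explore + n_report episodes in one new environment and
    return the summed reward of the last n_report episodes only.

    run_episode(i) is a hypothetical callback that plays episode i in the
    current environment and returns its total reward.
    """
    returns: List[float] = [run_episode(i) for i in range(n_explore + n_report)]
    # Exploratory episodes are played (the agent may adapt during them),
    # but their reward is excluded from the reported objective.
    return sum(returns[n_explore:])


if __name__ == "__main__":
    # Toy agent: earns nothing while exploring, 1.0 per episode afterwards.
    print(reported_return(lambda i: 0.0 if i < 3 else 1.0))
```

Under this reading, an agent is free to sacrifice reward in the first 3 episodes if that improves the 2 reporting episodes. Stating the objective this explicitly in the main text would resolve my main concern.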