VIME: Variational Information Maximizing Exploration, Yan Duan