Outcome-Driven Reinforcement Learning via Variational Inference Tim G. J. Rudner University of Oxford Vitchyr H. Pong

Open in new window