DeepMind papers at NIPS 2017 DeepMind
Learning in models with discrete latent variables is challenging due to high-variance gradient estimators. Previous approaches either produced high-variance, unbiased gradients or low-variance, biased gradients. REBAR uses control variates and the reparameterization trick to get the best of both: low-variance, unbiased gradients that result in faster convergence to a better result.

"We describe a new family of approaches for imagination-based planning... We also introduce architectures which provide new ways for agents to learn and construct plans to maximise the efficiency of a task. These architectures are efficient, robust to complex and imperfect models, and can adopt flexible strategies for exploiting their imagination. The agents we introduce benefit from an 'imagination encoder' - a neural network which learns to extract any information useful for the agent's future decisions, but ignore that which is not relevant."
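The control-variate idea in the REBAR summary above can be illustrated with a minimal sketch. This is not the REBAR estimator itself, which builds its control variate from a reparameterized continuous relaxation; it uses the simpler, well-known constant-baseline control variate on a score-function estimator. The toy objective f(b) = (b - 0.45)^2, the Bernoulli parameter theta = 0.3, and every function name are assumptions chosen for illustration.

```python
import random
import statistics

T = 0.45  # assumed target in the toy objective f(b) = (b - T)^2


def f(b):
    """Toy objective evaluated on a discrete sample b in {0, 1}."""
    return (b - T) ** 2


def score(b, theta):
    """d/dtheta of log Bernoulli(b | theta)."""
    return b / theta - (1 - b) / (1 - theta)


def grad_samples(theta, n, baseline=0.0, seed=0):
    """Single-sample score-function estimates of d/dtheta E[f(b)].

    Subtracting a constant baseline keeps the estimator unbiased,
    because the score has zero mean under Bernoulli(theta).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        b = 1 if rng.random() < theta else 0
        samples.append((f(b) - baseline) * score(b, theta))
    return samples


theta = 0.3
# Exact gradient: d/dtheta [theta * f(1) + (1 - theta) * f(0)] = f(1) - f(0)
true_grad = f(1) - f(0)

plain = grad_samples(theta, 20000)
# Baseline set to E[f(b)], computed exactly here since b takes only two values.
baseline = theta * f(1) + (1 - theta) * f(0)
with_cv = grad_samples(theta, 20000, baseline=baseline)

print("true gradient      :", true_grad)
print("plain  mean / var  :", statistics.mean(plain), statistics.variance(plain))
print("with CV mean / var :", statistics.mean(with_cv), statistics.variance(with_cv))
```

Both estimators average to roughly the same gradient, but the one with the baseline has much lower variance. REBAR pursues the same goal with a far stronger, learned control variate.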
Apr-28-2018, 21:16:04 GMT