Reviews: Randomized Prior Functions for Deep Reinforcement Learning

Neural Information Processing Systems 

Summary: This paper studies uncertainty-based exploration in RL. First, the authors compare several previously published RL exploration methods and identify their drawbacks (with illustrative toy experiments). Then, they extend one earlier method, bootstrapped DQN [1] (which uses bootstrap uncertainty estimates), by adding random prior functions. This extension is motivated by Bayesian linear regression and then carried over to deep non-linear neural networks. Experimental results on Chain, CartPole swing-up, and Montezuma's Revenge show improved performance over the bootstrapped DQN baseline.
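
To make the Bayesian linear regression motivation concrete, here is a minimal sketch of the idea as I understand it, not the authors' code. It assumes a zero-mean Gaussian prior on the weights and Gaussian observation noise; the function names (`sample_with_random_prior`, `posterior_moments`) and all parameter names are hypothetical. Each ensemble member draws a fixed random prior weight vector, perturbs the targets with noise, and fits only a regularized trainable offset on top of the prior. Under these assumptions the resulting weights should be distributed like draws from the exact posterior.

```python
import numpy as np


def sample_with_random_prior(X, y, noise_std, prior_std, rng):
    """Return one weight vector: fixed random prior plus a fitted offset.

    Assumes weights ~ N(0, prior_std^2 I) and targets with N(0, noise_std^2) noise.
    """
    n, d = X.shape
    # Fixed random prior function (here a linear map); it is never trained.
    theta_prior = rng.normal(0.0, prior_std, size=d)
    # Perturb the targets with fresh observation noise.
    y_noisy = y + rng.normal(0.0, noise_std, size=n)
    # Fit the trainable offset to what the prior does not already explain,
    # with a ridge penalty that pulls the offset toward zero.
    residual = y_noisy - X @ theta_prior
    ridge = (noise_std / prior_std) ** 2
    delta = np.linalg.solve(X.T @ X + ridge * np.eye(d), X.T @ residual)
    return theta_prior + delta


def posterior_moments(X, y, noise_std, prior_std):
    """Closed-form posterior mean and covariance for the same model."""
    d = X.shape[1]
    ridge = (noise_std / prior_std) ** 2
    A = X.T @ X + ridge * np.eye(d)
    mean = np.linalg.solve(A, X.T @ y)
    cov = noise_std ** 2 * np.linalg.inv(A)
    return mean, cov
```

Drawing many such samples and comparing their empirical mean and covariance with `posterior_moments` is one way to check the equivalence numerically. In the deep RL setting the paper replaces the linear maps with neural networks, where this equivalence is no longer exact and serves only as motivation.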