Understanding Deep Neural Function Approximation in Reinforcement Learning via ϵ-Greedy Exploration

Neural Information Processing Systems 

This problem setting is motivated by the successful deep Q-network (DQN) framework, which falls in this regime.