Deep Bandits Show-Off: Simple and Efficient Exploration with Deep Networks

Neural Information Processing Systems 

Designing efficient exploration is central to Reinforcement Learning because of the exploration-exploitation dilemma.