Minimal Exploration in Structured Stochastic Bandits

Richard Combes, Stefan Magureanu, Alexandre Proutiere

Neural Information Processing Systems

This paper introduces and addresses a wide class of stochastic bandit problems in which the function mapping each arm to its corresponding reward exhibits known structural properties.