Instance-optimal PAC Algorithms for Contextual Bandits

Neural Information Processing Systems 

We consider the stochastic contextual bandit problem in the PAC setting.