Finite-Time Regret of Thompson Sampling Algorithms for Exponential Family Multi-Armed Bandits

Neural Information Processing Systems 

We provide a tight regret analysis for ExpTS, which simultaneously yields both the finite-time regret bound as well as the asymptotic regret bound.