Improved Bayesian Regret Bounds for Thompson Sampling in Reinforcement Learning
