Regret Bounds for Thompson Sampling in Episodic Restless Bandit Problems
