Reviews: Regret Bounds for Thompson Sampling in Episodic Restless Bandit Problems

Neural Information Processing Systems 

The reviewers liked this paper, and I did as well. One thought is whether Exp4 can be adapted to this setting. The translation is by no means immediate, but it may be worth thinking about. Please take the reviewers' suggestions into consideration for the final version, as promised in your response.