as the reviewers were happy with the motivation and practicality of our work. We believe that our novelty is in proposing Thompson sampling latent bandit algorithms using offline-learned graphical

Neural Information Processing Systems 

We would like to thank the reviewers for their insightful reviews. The primary weakness that several reviewers brought up was that the methods and analysis were straightforward.

Reviewer #1 "algorithm... is not designed keeping short horizons in mind": Our algorithms quickly personalize by

Reviewer #2 "suffers an exploration-exploitation tradeoff": You are correct in noting that our algorithm depends on

"unified analyses cannot cover instance-dependent bounds": We derive Bayes regret bounds, which contain an expectation

Reviewer #3: Thank you for your detailed corrections! We will update the paper with your clarifications.

"the available epsilon-bounds are wildly pessimistic": You are correct in noting that our regret bounds require that the

Updating our regret bounds to reflect this is a future line of work.