Reviews: Worst-Case Regret Bounds for Exploration via Randomized Value Functions

Neural Information Processing Systems 

Post author response: I thank the author(s) for their response and commenting on my discussion points. As those would need additional work, I for now keep my original score: this is a solid paper. While the proof for Lemma 4 & 5 is described very well in the main text, it would be helpful to have a short explanation how this is used to achieve Lemma 6. If necessary, I suggest to drop the proof of Lemma 3 from the main text as this result is standard. Quality: I have verified the proof in the main text and individual lemmas in the appendix.