Reviews: Information-Theoretic Confidence Bounds for Reinforcement Learning
–Neural Information Processing Systems
The paper extends the work of Russo and Van Roy (JMLR 2016) to provide an information-theoretic analysis of Thompson sampling and UCB-like algorithms in a more general setting. The three reviewers acknowledge the contributions and the potential impact of connecting information-theoretic concepts to the design of algorithms. The reviewers have suggested ways to improve the manuscript. The authors should follow these directions and, in particular, fix the notation, include simulation results, and explain the proofs where necessary. The contributions of this paper are "methodological", i.e., it proposes a framework for analyzing the regret of certain algorithms.
Jan-23-2025, 05:45:38 GMT