Connections Between Mirror Descent, Thompson Sampling and the Information Ratio
Neural Information Processing Systems
The information-theoretic analysis by Russo and Van Roy [25] in combination with minimax duality has proved a powerful tool for the analysis of online learning algorithms in full and partial information settings. In most applications there is a tantalising similarity to the classical analysis based on mirror descent. We make a formal connection, showing that the information-theoretic bounds in most applications can be derived from existing techniques for online convex optimisation.
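For readers less familiar with the mirror descent side of this connection, the following is a minimal sketch of one textbook instance: mirror descent on the probability simplex with the negative-entropy regulariser, which reduces to the exponential-weights update in the full-information setting. It is an illustration only and is not taken from the paper. The function name `exponential_weights`, the learning rate `eta`, and the toy loss sequence are all assumptions chosen for the example.

```python
import math


def exponential_weights(losses, eta):
    """Run exponential weights (mirror descent with negative entropy).

    losses: list of per-round loss vectors, each with entries in [0, 1].
    eta: learning rate (hypothetical choice, not tuned).
    Returns the final distribution and the learner's cumulative expected loss.
    """
    k = len(losses[0])
    weights = [1.0 / k] * k  # start from the uniform distribution
    total = 0.0
    for loss in losses:
        # Expected loss of playing an action sampled from `weights`.
        total += sum(w * l for w, l in zip(weights, loss))
        # Multiplicative update, then renormalise back onto the simplex.
        unnorm = [w * math.exp(-eta * l) for w, l in zip(weights, loss)]
        z = sum(unnorm)
        weights = [u / z for u in unnorm]
    return weights, total


# Toy problem (assumed for illustration): action 0 never incurs loss.
losses = [[0.0, 1.0, 1.0]] * 20
final_weights, learner_loss = exponential_weights(losses, eta=0.5)
```

In this toy run the best fixed action has zero loss, so `learner_loss` is the regret. The standard full-information guarantee for this update, with losses in [0, 1], is regret at most log(k)/eta + eta*T/8 for k actions over T rounds.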
Jan-25-2025, 18:04:07 GMT