Connections Between Mirror Descent, Thompson Sampling and the Information Ratio

May-28-2019–arXiv.org Machine Learning

The information-theoretic analysis by Russo and Van Roy (2014) in combination with minimax duality has proved a powerful tool for the analysis of online learning algorithms in full and partial information settings. In most applications there is a tantalising similarity to the classical analysis based on mirror descent. We make a formal connection, showing that the information-theoretic bounds in most applications can be derived from existing techniques for online convex optimisation. Besides this, for $k$-armed adversarial bandits we provide an efficient algorithm with regret that matches the best information-theoretic upper bound and improve best known regret guarantees for online linear optimisation on $\ell_p$-balls and bandits with graph feedback.

artificial intelligence, machine learning, stab, (17 more...)

arXiv.org Machine Learning

May-28-2019

arXiv.org PDF

Add feedback

Country:
- Europe
  - Spain > Canary Islands (0.04)
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
  - France > Île-de-France
    - Paris > Paris (0.04)
  - Denmark > Capital Region
    - Copenhagen (0.04)

Genre:
- Research Report (0.50)

Industry:
- Education (0.34)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found