An Information-Theoretic Analysis for Thompson Sampling with Many Actions

Oct-8-2024, 09:42:38 GMT–Neural Information Processing Systems

Information-theoretic Bayesian regret bounds of Russo and Van Roy capture the dependence of regret on prior uncertainty. However, this dependence is through entropy, which can become arbitrarily large as the number of actions increases. We establish new bounds that depend instead on a notion of rate-distortion. Among other things, this allows us to recover through information-theoretic arguments a near-optimal bound for the linear bandit. We also offer a bound for the logistic bandit that dramatically improves on the best previously available, though this bound depends on an information-theoretic statistic that we have only been able to quantify via computation.
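
The remark that entropy can become arbitrarily large as the number of actions increases can be illustrated with a minimal sketch. It assumes, purely for illustration and not as anything taken from the paper, a prior under which each of K actions is equally likely to be optimal; the function name `uniform_entropy` is hypothetical. In that case the entropy equals log K, which grows without bound in K.

```python
import math


def uniform_entropy(num_actions: int) -> float:
    """Shannon entropy (in nats) of a uniform distribution over num_actions outcomes.

    Assumption for illustration: every action is equally likely to be optimal.
    """
    p = 1.0 / num_actions
    # Direct evaluation of -sum(p * log p); for a uniform distribution this equals log(num_actions).
    return -sum(p * math.log(p) for _ in range(num_actions))


if __name__ == "__main__":
    for k in (2, 10, 100, 1000):
        # Print the entropy next to log(k) to show that the two agree.
        print(k, uniform_entropy(k), math.log(k))
```

Any regret bound that scales with this quantity therefore loosens as actions are added, which is the motivation the abstract gives for replacing entropy with a rate-distortion notion.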