The Epoch-Greedy Algorithm for Multi-armed Bandits with Side Information

Feb-16-2024, 13:37:47 GMT–Neural Information Processing Systems

Epoch-Greedy has the following properties: No knowledge of a time horizon T is necessary. The regret incurred by Epoch-Greedy is controlled by a sample complexity bound for a hypothesis class. The regret scales as O(T^{2/3} S^{1/3}) or better, where S is the complexity term in a sample complexity bound for standard supervised learning.
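The epoch structure behind these properties can be illustrated with a minimal Python sketch. It assumes a finite list of candidate policies, a toy two-armed environment, and a square-root exploitation schedule; the names `best_policy`, `epoch_greedy`, `sqrt_schedule`, `draw_context`, `draw_reward`, and `POLICIES` are all hypothetical and chosen only for illustration. The schedule in particular is a placeholder, not one derived from a sample complexity bound. Each epoch takes one uniformly random exploration step, refits the empirically best policy on the exploration samples, and then exploits that policy for a number of steps that depends only on how many samples have been gathered so far.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Policy = Callable[[float], int]      # maps a context to an arm index
Sample = Tuple[float, int, float]    # (context, pulled arm, observed reward)


def best_policy(policies: Sequence[Policy], samples: List[Sample]) -> Policy:
    """Pick the policy with the largest reward summed over exploration
    samples on which the policy agrees with the arm that was pulled."""
    def score(h: Policy) -> float:
        return sum(r for x, a, r in samples if h(x) == a)
    return max(policies, key=score)


def sqrt_schedule(n_samples: int) -> int:
    """Assumed exploitation length after n_samples exploration steps."""
    return math.isqrt(n_samples)


def epoch_greedy(policies, n_arms, draw_context, draw_reward,
                 exploit_steps, n_rounds, rng):
    """Run the epoch loop; n_rounds only stops the simulation and is
    never used to decide how long to explore or exploit."""
    samples: List[Sample] = []
    total, t = 0.0, 0
    chosen = policies[0]
    while t < n_rounds:
        # Exploration step: pull an arm uniformly at random.
        x = draw_context(rng)
        a = rng.randrange(n_arms)
        r = draw_reward(x, a, rng)
        samples.append((x, a, r))
        total += r
        t += 1
        # Refit on exploration samples only.
        chosen = best_policy(policies, samples)
        # Exploitation steps: follow the current best policy.
        for _ in range(exploit_steps(len(samples))):
            if t >= n_rounds:
                break
            x = draw_context(rng)
            total += draw_reward(x, chosen(x), rng)
            t += 1
    return total, chosen


# Toy environment (an assumption): arm 0 pays 1 when x < 0.5, arm 1 otherwise.
def draw_context(rng: random.Random) -> float:
    return rng.random()


def draw_reward(x: float, arm: int, rng: random.Random) -> float:
    return 1.0 if arm == (1 if x >= 0.5 else 0) else 0.0


POLICIES: List[Policy] = [
    lambda x: 0,                       # always arm 0
    lambda x: 1,                       # always arm 1
    lambda x: 1 if x >= 0.5 else 0,    # matches the toy environment
    lambda x: 0 if x >= 0.5 else 1,    # inverted threshold
]
```

A call such as `epoch_greedy(POLICIES, 2, draw_context, draw_reward, sqrt_schedule, 2000, random.Random(0))` returns the accumulated reward and the policy selected last. Restricting the fit to exploration samples keeps the reward estimates unbiased, since exploitation steps only ever observe the arm the current policy prefers.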

epoch-greedy algorithm, multi-armed bandit, side information

Technology:
- Information Technology
  - Data Science > Data Mining
    - Big Data (0.51)
  - Artificial Intelligence
    - Machine Learning (0.55)
    - Representation & Reasoning > Search (0.40)