Algorithms for Infinitely Many-Armed Bandits

Wang, Yizao; Audibert, Jean-Yves; Munos, Rémi

Dec-31-2009–Neural Information Processing Systems

We consider multi-armed bandit problems where the number of arms is larger than the possible number of experiments. We make a stochastic assumption on the mean reward of a newly selected arm, which characterizes its probability of being a near-optimal arm. Our assumption is weaker than in previous works. We describe algorithms based on upper confidence bounds applied to a restricted set of randomly selected arms and provide upper bounds on the resulting expected regret. We also derive a lower bound which matches (up to a logarithmic factor) the upper bound in some cases.
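The general idea of running an upper-confidence-bound rule on a restricted, randomly selected set of arms can be illustrated with a minimal sketch. It is not the paper's algorithm: it uses the standard UCB1 index as a stand-in, Bernoulli rewards, and a subset size `n_arms` chosen by the caller. The names `ucb_on_random_subset` and `draw_arm` are hypothetical and introduced only for illustration.

```python
import math
import random


def ucb_on_random_subset(draw_arm, n_rounds, n_arms, rng):
    """Sample n_arms arms once from a reservoir, then run UCB1 on them.

    draw_arm(rng) is a hypothetical reservoir: each call returns the mean
    reward (in [0, 1]) of a freshly drawn arm. Rewards are assumed Bernoulli.
    Returns (means, counts, pulled_means_sum).
    """
    # Restrict attention to a random finite subset of the arm reservoir.
    means = [draw_arm(rng) for _ in range(n_arms)]
    counts = [0] * n_arms      # number of pulls per arm
    sums = [0.0] * n_arms      # cumulative observed reward per arm
    pulled_means_sum = 0.0     # sum of true means of pulled arms

    for t in range(1, n_rounds + 1):
        if t <= n_arms:
            k = t - 1          # initialization: pull each arm once
        else:
            # UCB1 index: empirical mean plus exploration bonus
            # (assumption: standard UCB1, not the paper's index).
            k = max(
                range(n_arms),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < means[k] else 0.0
        counts[k] += 1
        sums[k] += reward
        pulled_means_sum += means[k]

    return means, counts, pulled_means_sum


if __name__ == "__main__":
    rng = random.Random(0)
    n_rounds, n_arms = 2000, 20
    # Assumed reservoir: arm means drawn uniformly from [0, 1).
    means, counts, pulled = ucb_on_random_subset(
        lambda r: r.random(), n_rounds, n_arms, rng
    )
    # Pseudo-regret against the best mean achievable in this reservoir (1.0).
    print("pseudo-regret:", n_rounds * 1.0 - pulled)
```

The subset size drives the trade-off: too few arms and the subset may contain no near-optimal arm, too many and the budget is spent exploring them. In this sketch that choice is left entirely to the caller.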

Country:
- Europe > France (0.28)
- North America > United States
  - Michigan (0.14)

Technology:
- Information Technology
  - Artificial Intelligence > Machine Learning (0.68)
  - Data Science > Data Mining
    - Big Data (1.00)
