Gradient Ascent for Active Exploration in Bandit Problems

May-20-2019–arXiv.org Machine Learning

We present a new algorithm based on gradient ascent for a general Active Exploration bandit problem in the fixed-confidence setting. This problem encompasses several well-studied problems, such as Best Arm Identification and Thresholding Bandits. The algorithm consists of a new sampling rule based on online lazy mirror ascent. We prove that it is asymptotically optimal and, most importantly, computationally efficient.
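To make the phrase "online lazy mirror ascent" concrete, here is a minimal sketch of the generic technique on the probability simplex. It is not the paper's sampling rule. It assumes an entropic mirror map, under which the lazy update reduces to a softmax of the accumulated gradients. The names `lazy_mirror_ascent`, `grad_fn`, and `step_size`, and the constant step size, are hypothetical choices made for illustration.

```python
import math


def softmax(scores):
    """Map real-valued scores to a point on the probability simplex."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def lazy_mirror_ascent(grad_fn, n_arms, n_steps, step_size):
    """Generic lazy mirror ascent (dual averaging) with an entropic mirror map.

    grad_fn is a hypothetical callback returning a (sub)gradient of the
    concave objective at the current weights. The "lazy" variant accumulates
    raw gradients in the dual space and maps the running sum back to the
    simplex at each step, rather than updating the previous iterate.
    """
    cumulative = [0.0] * n_arms
    weights = [1.0 / n_arms] * n_arms  # start from uniform sampling proportions
    for _ in range(n_steps):
        grad = grad_fn(weights)
        cumulative = [c + g for c, g in zip(cumulative, grad)]
        weights = softmax([step_size * c for c in cumulative])
    return weights


if __name__ == "__main__":
    # Toy gradient that always favours the first arm (an assumption, for demonstration only).
    print(lazy_mirror_ascent(lambda w: [1.0, 0.0, 0.0], n_arms=3, n_steps=10, step_size=0.5))
```

In a bandit setting the returned weights would be read as target sampling proportions over arms. How the gradient is estimated from observations, and how the step size is tuned, are left open here.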

algorithm, artificial intelligence, big data, (20 more...)

Country:
- North America > United States > New York (0.14)

Genre:
- Research Report (0.40)

Technology:
- Information Technology
  - Artificial Intelligence
    - Machine Learning (1.00)
    - Representation & Reasoning (0.88)
  - Data Science > Data Mining
    - Big Data (1.00)
