Exploration by Optimisation in Partial Monitoring

Jul-12-2019–arXiv.org Machine Learning

We provide a simple and efficient algorithm for adversarial $k$-action $d$-outcome non-degenerate locally observable partial monitoring games for which the $n$-round minimax regret is bounded by $3(d+1) k^{3/2} \sqrt{8n \log(k)}$, matching the best known information-theoretic upper bounds.
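To get a feel for how the stated bound scales, the formula from the abstract can be evaluated numerically. The following is a minimal sketch, not code from the paper: the function names `regret_upper_bound` and `per_round_bound` are hypothetical, and it assumes that $\log$ denotes the natural logarithm.

```python
import math


def regret_upper_bound(k: int, d: int, n: int) -> float:
    """Evaluate 3(d+1) k^{3/2} sqrt(8 n log k) from the abstract.

    Assumption: log is the natural logarithm.
    k = number of actions, d = number of outcomes, n = number of rounds.
    """
    if k < 1 or d < 1 or n < 1:
        raise ValueError("k, d and n must all be positive integers")
    return 3 * (d + 1) * k ** 1.5 * math.sqrt(8 * n * math.log(k))


def per_round_bound(k: int, d: int, n: int) -> float:
    """Bound divided by the horizon n (hypothetical helper for illustration)."""
    return regret_upper_bound(k, d, n) / n


if __name__ == "__main__":
    # Illustrative game size, chosen arbitrarily for this sketch.
    k, d = 3, 2
    for n in (10**3, 10**5, 10**7):
        print(n, regret_upper_bound(k, d, n), per_round_bound(k, d, n))
```

Because the expression grows as $\sqrt{n}$, quadrupling the horizon doubles the bound, and the per-round value shrinks at rate $1/\sqrt{n}$.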

artificial intelligence, data mining, machine learning

Country:
- Europe
  - France > Île-de-France
    - Paris > Paris (0.04)
  - Spain > Canary Islands (0.04)
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
- North America > United States
  - California > Los Angeles County
    - Long Beach (0.04)
  - Illinois > Cook County
    - Chicago (0.04)

Genre:
- Research Report (0.40)

Technology:
- Information Technology
  - Artificial Intelligence > Machine Learning (1.00)
  - Data Science > Data Mining
    - Big Data (0.46)
