Adaptive Representation Selection in Contextual Bandit with Unlabeled History

Lin, Baihan, Cecchi, Guillermo, Bouneffouf, Djallel, Rish, Irina

Feb-3-2018–arXiv.org Machine Learning

We consider an extension of the contextual bandit setting, motivated by several practical applications, in which an unlabeled history of contexts may be available for pre-training before online decision-making begins. We propose an approach for improving the performance of contextual bandits in this setting through adaptive, dynamic representation learning, which combines offline pre-training on the unlabeled history of contexts with online selection and modification of embedding functions. Our experiments on a variety of datasets and in different nonstationary environments demonstrate clear advantages of our approach over the standard contextual bandit.
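The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea (offline fitting of an embedding on unlabeled contexts, then online selection among candidate embeddings), not the authors' method. Every name here (`fit_standardizer`, `LinearArmModel`, `EmbeddingSelectingBandit`) is hypothetical, the "embedding" is a simple feature standardizer standing in for a learned representation, and both selection levels use epsilon-greedy as an assumed stand-in for whatever bandit rule the paper uses. The sketch selects among fixed embeddings only; it does not modify them online.

```python
import random

def fit_standardizer(history):
    """Offline step (assumed): learn per-feature mean/std from unlabeled contexts."""
    n, d = len(history), len(history[0])
    means = [sum(x[j] for x in history) / n for j in range(d)]
    stds = [max((sum((x[j] - means[j]) ** 2 for x in history) / n) ** 0.5, 1e-8)
            for j in range(d)]
    return lambda x: [(x[j] - means[j]) / stds[j] for j in range(d)]

def identity(x):
    """Baseline embedding: use the raw context unchanged."""
    return list(x)

class LinearArmModel:
    """Per-arm linear reward estimate, trained online by SGD."""
    def __init__(self, dim, n_arms, lr=0.05):
        self.w = [[0.0] * dim for _ in range(n_arms)]
        self.lr = lr

    def predict(self, z, arm):
        return sum(wi * zi for wi, zi in zip(self.w[arm], z))

    def update(self, z, arm, reward):
        err = reward - self.predict(z, arm)
        self.w[arm] = [wi + self.lr * err * zi for wi, zi in zip(self.w[arm], z)]

class EmbeddingSelectingBandit:
    """Outer bandit picks an embedding; inner bandit picks an arm under it."""
    def __init__(self, embeddings, dim, n_arms, eps=0.1, alpha=0.1, seed=0):
        self.embeddings = embeddings
        self.models = [LinearArmModel(dim, n_arms) for _ in embeddings]
        self.values = [0.0] * len(embeddings)  # running reward per embedding
        self.counts = [0] * len(embeddings)
        self.n_arms, self.eps, self.alpha = n_arms, eps, alpha
        self.rng = random.Random(seed)

    def _eps_greedy(self, scores):
        if self.rng.random() < self.eps:
            return self.rng.randrange(len(scores))
        return max(range(len(scores)), key=scores.__getitem__)

    def act(self, context):
        e = self._eps_greedy(self.values)
        z = self.embeddings[e](context)
        arm = self._eps_greedy([self.models[e].predict(z, a)
                                for a in range(self.n_arms)])
        return e, z, arm

    def learn(self, e, z, arm, reward):
        self.counts[e] += 1
        # Constant step size, so recent rewards dominate the estimate.
        self.values[e] += self.alpha * (reward - self.values[e])
        self.models[e].update(z, arm, reward)
```

A typical loop would call `fit_standardizer` once on the unlabeled history, construct the bandit with `[identity, standardizer]`, and then alternate `act` and `learn` for each incoming context. The constant step size `alpha` is one simple, assumed way to keep the embedding-level estimates responsive when the environment drifts.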

Country:
- Europe
  - France (0.14)
  - United Kingdom > Scotland (0.14)
- North America > United States (0.14)

Genre:
- Research Report (1.00)

Industry:
- Health & Medicine (1.00)

Technology:
- Information Technology
  - Artificial Intelligence
    - Machine Learning (1.00)
    - Representation & Reasoning (0.94)
  - Data Science > Data Mining
    - Big Data (0.70)
