Greedy Bandits with Sampled Context

Jul-27-2020, arXiv.org Artificial Intelligence

Bayesian strategies for contextual bandits have proved promising in single-state reinforcement learning tasks by modeling uncertainty using context information from the environment. In this paper, we propose Greedy Bandits with Sampled Context (GB-SC), a method for contextual multi-armed bandits that develops the prior from the context information using Thompson Sampling and selects arms using an epsilon-greedy policy. The GB-SC framework allows the dependency between context and reward to be evaluated, and it provides robustness to partially observable context vectors by leveraging the developed prior. Our experimental results show competitive performance on the Mushroom environment in terms of expected regret and expected cumulative regret, and offer insights into how each context subset affects decision-making.
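
The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea it describes (a posterior built from context via Thompson-style sampling, with an epsilon-greedy choice on top), not the authors' GB-SC implementation. Everything here is an assumption made for illustration: the class name `GreedySampledContextBandit`, the Beta-Bernoulli model keyed by a discrete context value, the toy two-context environment in `run`, and all parameter values.

```python
import random
from collections import defaultdict


class GreedySampledContextBandit:
    """Hypothetical sketch: Beta posteriors per (context, arm), epsilon-greedy on sampled values."""

    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.n_arms = n_arms
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # Assumed model: a Beta(1, 1) prior for every (context, arm) pair,
        # stored as [alpha, beta].
        self.posterior = defaultdict(lambda: [1.0, 1.0])

    def select_arm(self, context):
        # With probability epsilon, explore uniformly at random.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_arms)
        # Otherwise draw one posterior sample per arm for this context
        # and act greedily on the sampled values.
        samples = [
            self.rng.betavariate(*self.posterior[(context, arm)])
            for arm in range(self.n_arms)
        ]
        return max(range(self.n_arms), key=samples.__getitem__)

    def update(self, context, arm, reward):
        # Conjugate update for a Bernoulli reward in {0, 1}.
        params = self.posterior[(context, arm)]
        params[0] += reward
        params[1] += 1 - reward


def run(n_rounds=2000, seed=0):
    """Run the sketch on a made-up environment and return (agent, cumulative regret)."""
    rng = random.Random(seed)
    # Toy environment (not the Mushroom dataset): success probability per arm,
    # indexed by context.
    true_p = {"a": [0.9, 0.1], "b": [0.2, 0.8]}
    agent = GreedySampledContextBandit(n_arms=2, epsilon=0.05, seed=seed)
    cumulative_regret = 0.0
    for _ in range(n_rounds):
        context = rng.choice(["a", "b"])
        arm = agent.select_arm(context)
        reward = 1 if rng.random() < true_p[context][arm] else 0
        agent.update(context, arm, reward)
        # Regret of this round: best expected reward minus the chosen arm's.
        cumulative_regret += max(true_p[context]) - true_p[context][arm]
    return agent, cumulative_regret
```

Calling `run()` returns the trained agent and the cumulative regret accumulated on the toy environment. Keying the posterior on the context value is one simple way to make the prior depend on context; the paper may structure this differently.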

data mining, machine learning, reinforcement learning

Country:
- North America > United States
  - Virginia (0.04)
  - Massachusetts > Middlesex County
    - Cambridge (0.04)
  - Florida > Broward County
    - Fort Lauderdale (0.04)
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.04)

Genre:
- Research Report (0.70)

Technology:
- Information Technology
  - Data Science > Data Mining
    - Big Data (0.50)
  - Artificial Intelligence > Machine Learning
    - Reinforcement Learning (0.35)
