To update or not to update? Delayed Nonparametric Bandits with Randomized Allocation

May-26-2020–arXiv.org Machine Learning

Contextual bandits provide a natural framework for modeling many practical sequential decision-making problems in various fields. Woodroofe (1979) began the study of multi-armed bandit problems with side information in a parametric framework, and Yang and Zhu (2002) initiated an investigation from a nonparametric perspective. See Lai (2001) and Bartroff et al. (2008) for reviews of general sequential problems, and Bubeck and Cesa-Bianchi (2012) for a review devoted to bandits. In recent years, bandit problems have gained popularity and have been studied extensively under different names, such as contextual bandits, multi-armed bandits with covariates (MABC), associative bandit problems, and multi-armed bandits with side information. For example, when treating patients with a disease, a doctor needs to decide which of several competing treatments would be best for the current patient, given that patient's covariate information and the data available from previous patients. Most bandit algorithms assume that rewards are observed instantaneously, but in most practical situations rewards are obtained only after some delay. For example, it is often the case that several other patients have to be treated before the outcome for the current patient is observed. One way to tackle this problem is to adopt black-box procedures that incorporate delayed rewards by reusing existing no-delay policies from the stochastic bandit setting.
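To make the black-box idea concrete, the following is a minimal sketch, not the procedure studied in the paper. It assumes a context-free epsilon-greedy rule as the no-delay policy and a known delay measured in rounds. The class names `EpsilonGreedy` and `DelayedWrapper`, and the methods `select`, `update`, and `observe`, are hypothetical names chosen for illustration only. The wrapper holds each reward in a buffer and passes it to the base policy only once its delay has elapsed, so the base policy never needs to know that delays exist.

```python
import random


class EpsilonGreedy:
    """Illustrative no-delay policy: expects update() right after each pull."""

    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.counts = [0] * n_arms    # number of rewards seen per arm
        self.means = [0.0] * n_arms   # running mean reward per arm
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def select(self):
        # Explore with probability epsilon, otherwise exploit the best mean.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.counts)), key=lambda a: self.means[a])

    def update(self, arm, reward):
        # Incremental update of the running mean for the pulled arm.
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


class DelayedWrapper:
    """Hypothetical black-box wrapper: buffers rewards until they arrive."""

    def __init__(self, base_policy):
        self.base = base_policy
        self.pending = []  # list of (arrival_round, arm, reward)
        self.t = 0         # index of the current round

    def select(self):
        self.t += 1
        # Feed the base policy every reward whose delay has elapsed.
        arrived = [p for p in self.pending if p[0] <= self.t]
        self.pending = [p for p in self.pending if p[0] > self.t]
        for _, arm, reward in arrived:
            self.base.update(arm, reward)
        return self.base.select()

    def observe(self, arm, reward, delay):
        # Assumption: the reward becomes visible `delay` rounds from now.
        self.pending.append((self.t + delay, arm, reward))


if __name__ == "__main__":
    policy = DelayedWrapper(EpsilonGreedy(n_arms=2))
    env = random.Random(1)
    for _ in range(200):
        arm = policy.select()
        # Toy environment (an assumption): arm 1 pays off more often.
        reward = 1.0 if env.random() < (0.3, 0.7)[arm] else 0.0
        policy.observe(arm, reward, delay=5)
    print(policy.base.counts, policy.base.means)
```

In the treatment example, `select` corresponds to choosing a treatment for the incoming patient, and `observe` records an outcome that becomes known only after several later patients have been treated. A contextual version would also pass the patient's covariates to `select` and `update`.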

data mining, machine learning, mean reward function


