Bandits with adversarial scaling

Thodoris Lykouris, Vahab Mirrokni, Renato Paes Leme

Mar-4-2020–arXiv.org Machine Learning

We study "adversarial scaling", a multi-armed bandit model in which rewards have both a stochastic and an adversarial component. Our model captures display advertising, where the click-through rate can be decomposed into an arm-quality component (fixed across time) and a non-stochastic user-relevance component (fixed across arms). Despite the relative stochasticity of our model, we demonstrate two settings where most bandit algorithms suffer. On the positive side, we show that two algorithms, one from the action elimination family and one from the mirror descent family, are adaptive enough to be robust to adversarial scaling. Our results shed light on the robustness of adaptive parameter selection in stochastic bandits, which may be of independent interest.
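The reward decomposition described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions: the abstract says the click-through rate "can be decomposed" into the two components but does not give the functional form, so the multiplicative form (arm quality times user relevance) used here is an assumption, as are the names `expected_ctr`, `draw_click`, `simulate`, `arm_qualities`, and `relevances`.

```python
import random


def expected_ctr(arm_quality, relevance):
    """Mean click probability, ASSUMING a multiplicative decomposition.

    arm_quality: per-arm value in [0, 1], fixed across time.
    relevance:   per-round value in [0, 1], fixed across arms, and
                 possibly chosen by an adversary.
    """
    return arm_quality * relevance


def draw_click(arm_quality, relevance, rng):
    """Sample a Bernoulli click (1 or 0) with the assumed mean."""
    return 1 if rng.random() < expected_ctr(arm_quality, relevance) else 0


def simulate(arm_qualities, relevances, choose_arm, seed=0):
    """Play one round per entry of `relevances` and return total clicks.

    choose_arm is a hypothetical policy: it maps the round index to an
    arm index. A real bandit algorithm would also use past feedback.
    """
    rng = random.Random(seed)
    total = 0
    for t, relevance in enumerate(relevances):
        arm = choose_arm(t)
        total += draw_click(arm_qualities[arm], relevance, rng)
    return total


if __name__ == "__main__":
    qualities = [0.9, 0.5]                # arm-quality component
    scaling = [1.0, 0.1, 0.0, 0.7] * 250  # user-relevance component
    print(simulate(qualities, scaling, choose_arm=lambda t: 0))
```

Under this assumed form, the ordering of arms by quality is the same in every round, while the scale of the observed rewards can change arbitrarily from round to round. In particular, rounds with relevance close to zero reveal almost nothing about which arm is better.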
