Multi-Player Bandits: The Adversarial Case

Pragnya Alatur, Kfir Y. Levy, Andreas Krause

arXiv.org Machine Learning 

The Multi-Armed Bandit (MAB) problem is a fundamental setting for capturing and analyzing sequential decision making. Since the seminal work of Robbins (1952), there has been a plethora of research on this topic (Cesa-Bianchi & Lugosi, 2006; Bubeck & Cesa-Bianchi, 2012; Lattimore & Szepesvári, 2018), addressing both the stochastic and adversarial MAB settings. In the stochastic setting, the environment is assumed to be stationary, namely that, except for noisy fluctuations, it does not change over time. The adversarial setting is more general and captures dynamic (arbitrarily changing) environments. Most existing work on MABs considers a single player who sequentially interacts with the environment.
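To make the adversarial setting concrete, below is a minimal, illustrative sketch of EXP3 (Auer et al.), the classical exponential-weights algorithm for adversarial bandits, in its loss-based form. This is not the method of this paper; the function name, parameters, and reward encoding are our own choices for the sketch, with rewards assumed to lie in [0, 1].

```python
import math
import random

def exp3(n_arms, rewards, eta):
    """Illustrative EXP3 sketch (loss-based exponential weights).

    rewards: one reward vector per round, chosen by an arbitrary
    (possibly adversarial) environment; entries assumed in [0, 1].
    eta: learning rate. Returns the total reward collected.
    """
    weights = [1.0] * n_arms
    total = 0.0
    for round_rewards in rewards:
        s = sum(weights)
        probs = [w / s for w in weights]
        # Sample an arm from the current exponential-weights distribution.
        arm = random.choices(range(n_arms), weights=probs)[0]
        reward = round_rewards[arm]
        total += reward
        # Importance-weighted loss estimate for the chosen arm only:
        # unbiased, since the arm is observed with probability probs[arm].
        est_loss = (1.0 - reward) / probs[arm]
        weights[arm] *= math.exp(-eta * est_loss)
    return total
```

Because the environment may pick the reward vectors arbitrarily in advance, the algorithm's guarantee is stated against the best fixed arm in hindsight rather than against a stationary distribution, which is exactly the gap between the adversarial and stochastic settings described above.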
