Gamification of Pure Exploration for Linear Bandits
Degenne, Rémy, Ménard, Pierre, Shang, Xuedong, Valko, Michal
We investigate an active pure-exploration setting, which includes best-arm identification, in the context of linear stochastic bandits. While asymptotically optimal algorithms exist for standard multi-armed bandits, the existence of such algorithms for best-arm identification in linear bandits has been elusive despite several attempts to address it. First, we provide a thorough comparison and new insight over different notions of optimality in the linear case, including G-optimality, transductive optimality from optimal experimental design, and asymptotic optimality.

Since the early work of Robbins (1952), a great amount of literature explores MAB in their standard stochastic setting with its numerous extensions and variants. Even-Dar et al. (2002) and Bubeck et al. (2009) are among the first to study the pure exploration setting for stochastic bandits. A non-exhaustive list of pure exploration games includes best-arm identification (BAI), top-m identification (Kalyanakrishnan & Stone, 2010), threshold bandits (Locatelli et al., 2016), minimum threshold (Kaufmann et al., 2018), signed bandits (Ménard, 2019), and pure exploration combinatorial bandits.
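As a concrete illustration of the G-optimality notion mentioned above (not the paper's own algorithm), the sketch below computes a G-optimal design over a finite arm set with the classical Frank–Wolfe iteration; by the Kiefer–Wolfowitz theorem, the optimal worst-case leverage equals the ambient dimension d. Function and variable names are illustrative assumptions.

```python
import numpy as np

def g_optimal_design(arms, iters=2000):
    """Frank-Wolfe iteration for a G-optimal design over a finite arm set.

    arms: (K, d) array of arm feature vectors spanning R^d.
    Returns weights lam minimizing max_a a^T A(lam)^{-1} a, where
    A(lam) = sum_a lam_a a a^T; the optimal value is d (Kiefer-Wolfowitz).
    """
    K, d = arms.shape
    lam = np.full(K, 1.0 / K)                  # start from the uniform design
    for t in range(iters):
        A = arms.T @ (lam[:, None] * arms)     # design matrix A(lam)
        Ainv = np.linalg.inv(A)
        # leverage scores a^T A^{-1} a for every arm
        g = np.einsum('ij,jk,ik->i', arms, Ainv, arms)
        a = int(np.argmax(g))                  # most uncertain direction
        gamma = 1.0 / (t + 2)                  # standard Frank-Wolfe step
        lam = (1 - gamma) * lam
        lam[a] += gamma
    return lam
```

For the canonical basis of R^d the G-optimal design is uniform, and the worst-case leverage converges to d, matching the Kiefer–Wolfowitz value.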
Jul-2-2020