Non-Asymptotic Pure Exploration by Solving Games

Degenne, Rémy, Koolen, Wouter M., Ménard, Pierre

Mar-19-2020, 02:33:05 GMT–Neural Information Processing Systems

Pure exploration (aka active testing) is the fundamental task of sequentially gathering information to answer a query about a stochastic environment. Good algorithms make few mistakes and take few samples. Lower bounds (for multi-armed bandit models with arms in an exponential family) reveal that the sample complexity is determined by the solution to an optimisation problem. The existing state of the art algorithms achieve asymptotic optimality by solving a plug-in estimate of that optimisation problem at each step. We interpret the optimisation problem as an unknown game, and propose sampling rules based on iterative strategies to estimate and converge to its saddle point.

exponential family, non-asymptotic pure exploration, optimisation problem

Neural Information Processing Systems

Mar-19-2020, 02:33:05 GMT

Conferences Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence (0.51)