Sequential Test for the Lowest Mean: From Thompson to Murphy Sampling

Kaufmann, Emilie, Koolen, Wouter M., Garivier, Aurélien

Feb-14-2020, 18:26:25 GMT–Neural Information Processing Systems

Learning the minimum or maximum mean among a finite set of distributions is a fundamental sub-problem in planning, game tree search and reinforcement learning. We formalize this learning task as the problem of sequentially testing how the minimum mean among a finite set of distributions compares to a given threshold. We develop refined non-asymptotic lower bounds, which show that optimality mandates very different sampling behavior for a low versus a high true minimum. We show that Thompson Sampling and the intuitive Lower Confidence Bounds policy are each optimal in only one of these cases. We develop a novel approach that we call Murphy Sampling.
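The three sampling rules named in the abstract can be contrasted in a minimal sketch. Everything specific here is an assumption made for illustration rather than the authors' implementation: unit-variance Gaussian rewards with a flat prior, the confidence radius used in `lcb_arm`, the reading of Murphy Sampling as Thompson Sampling on a posterior conditioned on the minimum mean lying below the threshold `gamma`, the rejection-sampling loop with its `max_tries` cap, and all function names.

```python
import math
import random


def posterior_sample(means, counts, rng):
    """Draw one mean vector from independent posteriors N(mean_a, 1/n_a).

    Assumes unit-variance Gaussian rewards and a flat prior (illustrative only).
    """
    return [rng.gauss(m, 1.0 / math.sqrt(n)) for m, n in zip(means, counts)]


def argmin(values):
    """Index of the smallest entry (first one on ties)."""
    return min(range(len(values)), key=values.__getitem__)


def thompson_arm(means, counts, rng):
    """Thompson-style rule: sample the posterior, pull the lowest sampled mean."""
    return argmin(posterior_sample(means, counts, rng))


def lcb_arm(means, counts, t):
    """Lower Confidence Bounds rule: pull the arm with the smallest lower bound.

    The radius sqrt(2 log t / n_a) is a generic choice, assumed for this sketch.
    """
    return argmin([m - math.sqrt(2.0 * math.log(t) / n)
                   for m, n in zip(means, counts)])


def murphy_arm(means, counts, gamma, rng, max_tries=10_000):
    """Murphy-style rule: posterior sample conditioned on min mean < gamma.

    Conditioning is done by naive rejection sampling; max_tries is a
    hypothetical safety cap, not part of the method.
    """
    for _ in range(max_tries):
        theta = posterior_sample(means, counts, rng)
        if min(theta) < gamma:
            return argmin(theta)
    # Conditioning event too rare for rejection sampling: fall back to the
    # arm with the lowest empirical mean (an arbitrary choice for the sketch).
    return argmin(means)


if __name__ == "__main__":
    rng = random.Random(0)
    means, counts = [0.4, 0.1, 0.3], [20, 20, 20]  # made-up empirical statistics
    print("Thompson:", thompson_arm(means, counts, rng))
    print("LCB:     ", lcb_arm(means, counts, t=60))
    print("Murphy:  ", murphy_arm(means, counts, gamma=0.0, rng=rng))
```

Rejection sampling keeps the sketch short but becomes slow when the posterior puts little mass on a minimum below `gamma`; a practical implementation would presumably sample from the conditioned posterior more directly.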

lowest mean, murphy sampling, sequential test

Genre:
- Research Report (0.45)

Industry:
- Leisure & Entertainment > Games (0.65)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning (0.91)
  - Representation & Reasoning > Search (0.65)