On Interruptible Pure Exploration in Multi-Armed Bandits
Shleyfman, Alexander (Technion – Israel Institute of Technology) | Komenda, Antonín (Czech Technical University in Prague) | Domshlak, Carmel (Technion – Israel Institute of Technology)
Interruptible pure exploration in multi-armed bandits (MABs) is a key component of Monte-Carlo tree search algorithms for sequential decision problems. We introduce Discriminative Bucketing (DB), a novel family of strategies for pure exploration in MABs, which allows for adapting recent advances in non-interruptible strategies to the interruptible setting, while guaranteeing exponential-rate performance improvement over time. Our experimental evaluation demonstrates that the corresponding instances of DB favorably compete both with the currently popular strategies UCB1 and Epsilon-Greedy, as well as with the conservative uniform sampling.
Mar-6-2015
- Country:
- North America > United States
- California (0.04)
- Europe
- Czechia > Prague (0.04)
- Slovenia > Central Slovenia
- Municipality of Komenda > Komenda (0.04)
- Asia > Middle East
- Israel > Haifa District > Haifa (0.04)
- North America > United States
- Technology: