Best Arm Identification with Fixed Budget: A Large Deviation Perspective

Neural Information Processing Systems

We consider the problem of identifying the best arm in stochastic Multi-Armed Bandits (MABs) using a fixed sampling budget.
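
To make the setting concrete, the fixed-budget problem is commonly formalized as minimizing the probability of recommending a wrong arm once the budget is exhausted. The sketch below uses assumed notation ($K$ arms with means $\mu_k$, budget $T$, pull counts $N_k(T)$, recommendation $\hat{k}_T$) that is introduced here for illustration and may differ from the notation used in the rest of the paper; it also assumes the best arm is unique.

```latex
% Assumed notation (not taken from the source):
%   K        number of arms, with unknown mean rewards \mu_1, ..., \mu_K
%   T        total sampling budget (number of pulls)
%   N_k(T)   number of times arm k has been pulled after T rounds
%   \hat{k}_T  arm recommended once the budget is spent
\[
  k^\star = \arg\max_{k \in \{1,\dots,K\}} \mu_k ,
  \qquad
  \text{minimize } \; \mathbb{P}\bigl[\hat{k}_T \neq k^\star\bigr]
  \quad \text{subject to} \quad \sum_{k=1}^{K} N_k(T) = T .
\]
```

Under this formulation, the sampling rule decides how the $T$ pulls are split across arms, and the quantity of interest is how quickly the error probability decays as $T$ grows.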