Minimax Optimal Algorithms for Fixed-Budget Best Arm Identification

Neural Information Processing Systems 

We consider the fixed-budget best arm identification problem, where the goal is to find the arm with the largest mean using a fixed number of samples. It is known that the probability of misidentifying the best arm decays exponentially in the number of rounds. However, the rate (exponent) of this decay has received only limited characterization. In this paper, we characterize the minimax optimal rate as the result of an optimization over all possible parameters. We introduce two rates, $R^{\mathrm{go}}$ and $R^{\mathrm{go}}_{\infty}$, corresponding to lower bounds on the probability of misidentification, each of which is associated with a proposed algorithm.
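
To make the notion of a rate (exponent) concrete, one common way to formalize it is sketched below. The notation is an assumption introduced here for illustration and is not fixed by the text above: $T$ denotes the budget (number of rounds), $J(T)$ the arm recommended after $T$ samples, and $i^*$ the arm with the largest mean.

\[
R \;=\; \liminf_{T \to \infty} \, -\frac{1}{T} \log \mathbb{P}\bigl[ J(T) \neq i^* \bigr]
\]

Under this reading, an algorithm with rate $R$ has misidentification probability of roughly $\exp(-RT)$ for large $T$, so a larger rate means faster decay of the error, and a lower bound on the probability of misidentification corresponds to an upper bound on the achievable rate.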