A Note on KL-UCB+ Policy for the Stochastic Bandit
