UCB algorithms for multi-armed bandits: Precise regret and adaptive inference
