Bayesian bandits: balancing the exploration-exploitation tradeoff via double sampling
