A KL-LUCB algorithm for Large-Scale Crowdsourcing
Ervin Tanczos, Robert Nowak, Bob Mankoff
Neural Information Processing Systems
This paper focuses on best-arm identification in multi-armed bandits with bounded rewards. We develop an algorithm that is a fusion of lil-UCB and KL-LUCB, offering the best qualities of the two algorithms in one method. This is achieved by proving a novel anytime confidence bound for the mean of bounded distributions, which is the analogue of the LIL-type bounds recently developed for sub-Gaussian distributions. We corroborate our theoretical results with numerical experiments based on the New Yorker Cartoon Caption Contest.
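To illustrate why KL-based confidence bounds are attractive for bounded rewards, the following is a minimal sketch of a generic Bernoulli-KL confidence interval computed by bisection. It is not the anytime bound proved in the paper: the function names and the fixed `threshold` parameter are assumptions made for illustration, and the paper's bound would replace that fixed threshold with its own LIL-type exploration term.

```python
import math


def bernoulli_kl(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def kl_upper_bound(mean, n, threshold, tol=1e-9):
    """Largest q in [mean, 1] with n * KL(mean, q) <= threshold (bisection)."""
    lo, hi = mean, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if n * bernoulli_kl(mean, mid) <= threshold:
            lo = mid  # mid is still inside the confidence region
        else:
            hi = mid
    return lo


def kl_lower_bound(mean, n, threshold, tol=1e-9):
    """Smallest q in [0, mean] with n * KL(mean, q) <= threshold (bisection)."""
    lo, hi = 0.0, mean
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if n * bernoulli_kl(mean, mid) <= threshold:
            hi = mid  # mid is still inside the confidence region
        else:
            lo = mid
    return hi


def hoeffding_radius(n, threshold):
    """Radius of the sub-Gaussian (Hoeffding-style) interval for the same threshold."""
    return math.sqrt(threshold / (2 * n))


if __name__ == "__main__":
    # Hypothetical inputs: empirical mean 0.05 after 100 pulls, threshold log(1/0.05).
    mean, n, threshold = 0.05, 100, math.log(1 / 0.05)
    print("KL interval:", kl_lower_bound(mean, n, threshold), kl_upper_bound(mean, n, threshold))
    print("Hoeffding interval:", mean - hoeffding_radius(n, threshold), mean + hoeffding_radius(n, threshold))
```

By Pinsker's inequality the KL interval is never wider than the Hoeffding-style interval at the same threshold, and it is noticeably narrower when the empirical mean is close to 0 or 1.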
Oct-4-2024, 04:22:46 GMT
- Country:
- North America > United States
- New York (0.27)
- Wisconsin > Dane County
- Madison (0.04)
- California > Los Angeles County
- Long Beach (0.04)
- Europe > United Kingdom
- England > Oxfordshire > Oxford (0.04)
- Technology: