Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems 

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance.

In this problem there is a hidden unknown parameter \theta^* \in R^d and a finite set of arms X \subseteq R^d. When an arm x \in X is pulled, you observe a reward x^T \theta^* + \epsilon, where \epsilon is zero-mean i.i.d. noise with bounded range. The goal is to identify argmax_{x \in X} x^T \theta^* using as few samples as possible.

Results: The paper characterizes the sample complexity of static and dynamic allocation strategies for identifying the best arm.
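To make the setting concrete, here is a minimal Python sketch of the reward model described above, paired with a simple static allocation. Everything specific in it is an assumption made for illustration and is not taken from the paper: the uniform noise on [-1, 1], the round-robin allocation, the least-squares estimate, the example arms, and the helper names `pull` and `uniform_static_allocation`. The paper's own allocation strategies are not reproduced here.

```python
import numpy as np

def pull(x, theta_star, rng, noise_bound=1.0):
    """Observe x^T theta* + epsilon for one pull of arm x.

    Assumption: epsilon is uniform on [-noise_bound, noise_bound], which is
    one example of zero-mean i.i.d. noise with bounded range.
    """
    return float(x @ theta_star) + rng.uniform(-noise_bound, noise_bound)

def uniform_static_allocation(X, theta_star, n_samples, rng):
    """Illustrative static strategy (hypothetical, not the paper's).

    The pull sequence is fixed in advance as a round-robin over the arms.
    After all pulls, theta* is estimated by least squares and the arm with
    the largest estimated reward is returned (as a row index into X).
    """
    n_arms = X.shape[0]
    pulled = np.array([X[t % n_arms] for t in range(n_samples)])
    rewards = np.array([pull(x, theta_star, rng) for x in pulled])
    theta_hat, *_ = np.linalg.lstsq(pulled, rewards, rcond=None)
    return int(np.argmax(X @ theta_hat))

# Hypothetical instance: d = 2, three arms.
rng = np.random.default_rng(0)
theta_star = np.array([1.0, 0.0])
X = np.array([[1.0, 0.0], [0.0, 1.0], [0.6, 0.6]])

best_arm = int(np.argmax(X @ theta_star))  # true best arm, unknown to the learner
estimate = uniform_static_allocation(X, theta_star, n_samples=3000, rng=rng)
```

A static strategy like this one fixes the pull sequence before seeing any rewards, whereas a dynamic strategy would choose each pull based on the rewards observed so far. The sample-complexity question is how large `n_samples` must be for `estimate` to equal `best_arm` with high probability.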