Review for NeurIPS paper: An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits

Neural Information Processing Systems 

Additional Feedback: I read the paper with interest, but got a bit disappointed in the end. Asymptotic optimality seems to be the focus of the paper, and this is the point I disagree with. Certainly, having asymptotic optimality is good, but only performing well on that--rather than finite-time optimality--is not enough given that linear contextual bandits have been studied extensively. In particular, a simple epsilon-greedy algorithm with epsilon decreasing to 0 at an appropriate rate is already asymptotically optimal. So in my view, finite-time regret must be the clear performance metric for evaluating an algorithm.