Review for NeurIPS paper: Regret in Online Recommendation Systems

Neural Information Processing Systems 

Weaknesses: My main concern is that the paper does not clearly compare the regret bounds it obtains with those in the existing literature. I find the presentation of the regret bounds fairly non-standard and hard to interpret. These are some of my concerns. It seems to me that R_{sp}(T) is just the standard K-armed bandit lower bound, which applies here by reduction to the case where the cluster identity of each item is known but {p_1, ..., p_K} must still be learned. R_{ic}, on the other hand, seems to come simply from running out of items to recommend from the top cluster, combined with a bound on the size of that cluster that follows from the initial sampling from the \alpha's.
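
For concreteness, the standard bound I have in mind is the classical asymptotic (Lai-Robbins) lower bound for a K-armed bandit. The statement below is a sketch under my own assumptions, not the paper's result: rewards are Bernoulli with means p_1, ..., p_K, the policy is uniformly good, and R(T), p^\star and kl are my notation rather than the authors'.

```latex
% Assumed setting (not taken from the paper): K Bernoulli arms with means
% p_1, ..., p_K, p^\star = \max_k p_k, and any uniformly good policy.
\[
  \liminf_{T \to \infty} \frac{R(T)}{\log T}
  \;\ge\; \sum_{k \,:\, p_k < p^\star} \frac{p^\star - p_k}{\mathrm{kl}(p_k, p^\star)},
  \qquad
  \mathrm{kl}(p, q) = p \log\frac{p}{q} + (1 - p) \log\frac{1 - p}{1 - q}.
\]
```

If R_{sp}(T) is of this form, or departs from it for a structural reason, stating that explicitly would make the comparison with the literature much easier to follow.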