Reviews: Improved Regret Bounds for Oracle-Based Adversarial Contextual Bandits
–Neural Information Processing Systems
The paper makes an interesting contribution by proposing a new partial-information relaxation that improves the regret bound. The analysis of the partial-information relaxation bears some similarity to that of Rakhlin and Sridharan, but the inclusion of the Rademacher term is new. By using Lemma 2, the bound is improved. While this is definitely a new result and there are new ideas, the similarity to the existing literature compels me to give a '3' instead of a '4' for the Novelty/Originality and Technical Contribution scores. Presentation: The paper is concise and well written, but I think that the proof of admissibility between 5 and 7 is too "straight-lined".
Jan-20-2025, 21:25:26 GMT