Reviews: Contextual semibandits via supervised learning oracles
–Neural Information Processing Systems
This paper is very interesting overall, and I believe it meets the standard for a NIPS poster. In particular, to the best of my knowledge, this is the first paper to consider contextual combinatorial semi-bandits with *unknown* weights. However, I think some parts of the paper can still be improved, and I would appreciate it if the authors polished the final version accordingly: 1) In Theorem 2, the O(T^{2/3}) regret bound is somewhat unsatisfactory, since I would expect an O(T^{1/2}) regret bound. If the authors believe that the O(T^{2/3}) rate is intrinsic to the problem, please discuss why. If they believe it is an artifact of a loose analysis, please discuss that as well (i.e., which step of the analysis leads to the non-tight bound). Please also rewrite the motivation and explanation of the algorithm.
Jan-20-2025, 21:26:37 GMT