Export Reviews, Discussions, Author Feedback and Meta-Reviews
–Neural Information Processing Systems
In this paper, the authors extend "resource allocation with semi-bandit feedback", proposed by Lattimore et al. [2014], to the multi-resource case. The paper provides two regret bounds: one for the worst case (Theorem 2) and one for the "resource-laden" case (Theorem 7). The authors also provide a new result on "weighted least squares estimation", which is of independent interest. The paper is well written and very interesting, and the analysis is rigorous. The extension to the multi-resource case is non-trivial, and the new result on weighted least squares estimation might be reused by other researchers in the bandit/RL field. Thus, I think this paper meets the acceptance threshold.
Feb-12-2025, 00:11:41 GMT