Reviews: Improving Regret Bounds for Combinatorial Semi-Bandits with Probabilistically Triggered Arms and Its Applications

Oct-8-2024, 06:12:01 GMT–Neural Information Processing Systems

Overview: The paper studies combinatorial multi-armed bandit with probabilistically triggered arms and semi-bandit feedback (CMAB-T). The paper improves the existing regret bounds by introducing a new version of the bounded smoothness condition on the reward functions, called triggering probability modulated bounded smoothness condition. The intuition behind this is that hard-to-trigger arms do not contribute much to the expected regret. With this new condition, the authors are able to remove a potentially exponentially large factor in the existing regret bounds. The paper then shows that applications such as influence maximization bandit and combinatorial cascading bandit both satisfy this condition.

application, combinatorial semi-bandit, probabilistically triggered arm, (6 more...)

Neural Information Processing Systems

Oct-8-2024, 06:12:01 GMT

Conferences Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (0.97)