Combinatorial Bandits Revisited
Richard Combes, Mohammad Sadegh Talebi Mazraeh Shahi, Alexandre Proutiere, marc lelarge
–Neural Information Processing Systems
This paper investigates stochastic and adversarial combinatorial multi-armed bandit problems. In the stochastic setting under semi-bandit feedback, we derive a problem-specific regret lower bound, and discuss its scaling with the dimension of the decision space. We propose ESCB, an algorithm that efficiently exploits the structure of the problem and provide a finite-time analysis of its regret. ESCB has better performance guarantees than existing algorithms, and significantly outperforms these algorithms in practice.
Neural Information Processing Systems
Oct-2-2025, 00:18:24 GMT
- Country:
- Europe
- France > Île-de-France
- Sweden > Stockholm
- Stockholm (0.04)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- Europe
- Genre:
- Research Report (0.68)
- Technology: