Statistical Efficiency of Thompson Sampling for Combinatorial Semi-Bandits

Neural Information Processing Systems 

We investigate the stochastic combinatorial multi-armed bandit problem with semi-bandit feedback (CMAB).