Thompson Sampling For Combinatorial Bandits: Polynomial Regret and Mismatched Sampling Paradox

Open in new window