Neural Information Processing Systems 

Ensemble sampling serves as a practical approximation to Thompson sampling when maintaining an exact posterior distribution over model parameters is computationally intractable. In this paper, we establish a regret bound that ensures desirable behavior when ensemble sampling is applied to the linear bandit problem.