Phase Transitions and Cyclic Phenomena in Bandits with Switching Constraints

David Simchi-Levi, Yunzong Xu

Neural Information Processing Systems

MAB problem, the learner (i.e., decision-maker) is allowed to switch freely between actions, and