Cornering Stationary and Restless Mixing Bandits with Remix-UCB

Neural Information Processing Systems 

We study the restless bandit problem where arms are associated with stationary ϕ-mixing processes, so that rewards are dependent. The question that arises in this setting is how to carefully recover some independence by 'ignoring' the values of some rewards.