An Adaptive Approach for Infinitely Many-armed Bandits under Generalized Rotting Constraints

Neural Information Processing Systems 

In this study, we consider the infinitely many-armed bandit problem in a rested rotting setting, where the mean reward of an arm may decrease each time the arm is pulled and otherwise remains unchanged.
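To make the rested rotting dynamics concrete, the following is a minimal simulation sketch, not the environment or algorithm studied in this work. It only illustrates that an arm's mean reward can change when that arm is pulled and stays fixed while the arm is idle. The names `RestedRottingArm`, `sample_new_arm`, `rot_per_pull`, and `noise_std` are hypothetical, and the fixed per-pull decrement, the Gaussian noise, and the uniform initial means on [0, 1] are illustrative assumptions rather than details taken from the text.

```python
import random


class RestedRottingArm:
    """One arm whose mean reward changes only when the arm itself is pulled."""

    def __init__(self, initial_mean, rot_per_pull, noise_std=0.0, rng=None):
        self.mean = initial_mean          # current mean reward
        self.rot_per_pull = rot_per_pull  # assumed fixed decrement (illustrative)
        self.noise_std = noise_std        # assumed Gaussian noise scale
        self.rng = rng if rng is not None else random.Random()

    def pull(self):
        # The reward is drawn around the mean in effect before this pull.
        reward = self.mean + self.rng.gauss(0.0, self.noise_std)
        # Rested rotting: the mean can drop only as a result of this pull.
        self.mean -= self.rot_per_pull
        return reward


def sample_new_arm(rng, rot_per_pull=0.05, noise_std=0.0):
    """Draw a fresh arm from the (conceptually infinite) reservoir."""
    # Assumption: initial means are uniform on [0, 1].
    return RestedRottingArm(rng.uniform(0.0, 1.0), rot_per_pull, noise_std, rng)


rng = random.Random(0)
pulled = RestedRottingArm(initial_mean=0.9, rot_per_pull=0.1, rng=rng)
idle = RestedRottingArm(initial_mean=0.9, rot_per_pull=0.1, rng=rng)

# Pull one arm three times; the other arm is never touched.
rewards = [pulled.pull() for _ in range(3)]
fresh = sample_new_arm(rng)
```

With the noise switched off, the rewards of the pulled arm fall by `rot_per_pull` on each successive pull, while the mean of the idle arm is unchanged. This is the distinction between the rested setting and one in which means would drift with time regardless of pulls.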