Bandit Learning with Positive Externalities

Virag Shah, Jose Blanchet, Ramesh Johari

Neural Information Processing Systems 

On many platforms, user arrivals exhibit self-reinforcing behavior: future arrivals are likely to have preferences similar to those of users who were satisfied in the past.