Incentivized Bandit Learning with Self-Reinforcing User Preferences
