Incentivized Exploration via Filtered Posterior Sampling
Anand Kalvit, Aleksandrs Slivkins, Yonatan Gur
– arXiv.org Artificial Intelligence
A principal (social planner) interacts sequentially with a flow of self-interested agents, each of whom takes actions, consumes information, and produces information over time. The planner's goal is to maximize the aggregate utility of all agents it interacts with, which requires agents to occasionally take exploratory actions that might otherwise be deemed inferior from an empirical standpoint. While such exploratory actions are the cornerstone of online learning, as they help the principal learn the best actions over time, they also reflect misaligned incentives between the principal and individual agents. How can a welfare-maximizing principal achieve her goal in the presence of such misaligned incentives? This is the essence of the incentivized exploration problem.
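The abstract does not spell out the paper's Filtered Posterior Sampling algorithm, but the exploration it refers to is the kind performed by standard posterior (Thompson) sampling. The sketch below is a minimal, generic Bernoulli-bandit illustration of that baseline, not the authors' method; the class name `ThompsonSampling` and the Beta-prior setup are illustrative assumptions.

```python
import random


class ThompsonSampling:
    """Posterior (Thompson) sampling for a Bernoulli bandit.

    Illustrative sketch only: keep a Beta(alpha, beta) posterior per arm,
    sample a plausible mean from each posterior, and play the arm with the
    highest draw. Randomness in the draws is what produces exploration.
    """

    def __init__(self, num_arms: int):
        # Beta(1, 1) uniform prior over each arm's success probability.
        self.alpha = [1.0] * num_arms
        self.beta = [1.0] * num_arms

    def select_arm(self) -> int:
        # Draw a mean-reward sample for every arm and exploit the draw.
        samples = [random.betavariate(a, b)
                   for a, b in zip(self.alpha, self.beta)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm: int, reward: int) -> None:
        # Bayesian update for a Bernoulli observation (reward in {0, 1}).
        self.alpha[arm] += reward
        self.beta[arm] += 1 - reward


if __name__ == "__main__":
    true_means = [0.3, 0.5, 0.7]  # hypothetical arm reward probabilities
    agent = ThompsonSampling(num_arms=len(true_means))
    for _ in range(1000):
        arm = agent.select_arm()
        reward = 1 if random.random() < true_means[arm] else 0
        agent.update(arm, reward)
    print("posterior means:",
          [a / (a + b) for a, b in zip(agent.alpha, agent.beta)])
```

In the incentivized-exploration setting described above, a myopic agent would not follow such draws when they conflict with the empirically best action, which is the incentive gap the paper's filtering approach is designed to address.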
Feb-20-2024