Incentivized Exploration via Filtered Posterior Sampling
Anand Kalvit, Aleksandrs Slivkins, Yonatan Gur
– arXiv.org Artificial Intelligence
A principal (social planner) interacts sequentially with a flow of self-interested agents, each of whom takes actions, consumes information, and produces information over time. The planner's goal is to maximize the aggregate utility of all agents it interacts with, which requires agents to occasionally take exploratory actions that might otherwise be deemed inferior from an empirical standpoint. While such exploratory actions are the cornerstone of online learning, as they help the principal learn the best actions over time, they also reflect misaligned incentives between the principal and individual agents. How can a welfare-maximizing principal achieve her goal in the presence of such misaligned incentives? This is the essence of the incentivized exploration problem.
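The abstract does not spell out the paper's Filtered Posterior Sampling algorithm, but the exploration it refers to is the kind performed by standard posterior (Thompson) sampling. The sketch below is a minimal, generic Bernoulli-bandit illustration of that baseline, not the authors' method; the class name `ThompsonSampling` and the Beta-prior setup are illustrative assumptions.

```python
import random


class ThompsonSampling:
    """Posterior (Thompson) sampling for a Bernoulli bandit.

    Illustrative sketch only: keep a Beta(alpha, beta) posterior per arm,
    sample a plausible mean from each posterior, and play the arm with the
    highest draw. Randomness in the draws is what produces exploration.
    """

    def __init__(self, num_arms: int):
        # Beta(1, 1) uniform prior over each arm's success probability.
        self.alpha = [1.0] * num_arms
        self.beta = [1.0] * num_arms

    def select_arm(self) -> int:
        # Draw a mean-reward sample for every arm and exploit the draw.
        samples = [random.betavariate(a, b)
                   for a, b in zip(self.alpha, self.beta)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm: int, reward: int) -> None:
        # Bayesian update for a Bernoulli observation (reward in {0, 1}).
        self.alpha[arm] += reward
        self.beta[arm] += 1 - reward


if __name__ == "__main__":
    true_means = [0.3, 0.5, 0.7]  # hypothetical arm reward probabilities
    agent = ThompsonSampling(num_arms=len(true_means))
    for _ in range(1000):
        arm = agent.select_arm()
        reward = 1 if random.random() < true_means[arm] else 0
        agent.update(arm, reward)
    print("posterior means:",
          [a / (a + b) for a, b in zip(agent.alpha, agent.beta)])
```

In the incentivized-exploration setting described above, a myopic agent would not follow such draws when they conflict with the empirically best action, which is the incentive gap the paper's filtering approach is designed to address.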
Feb-20-2024