Learning to Play General-Sum Games Against Multiple Boundedly Rational Agents
Zhao, Eric, Trott, Alexander R., Xiong, Caiming, Zheng, Stephan
–arXiv.org Artificial Intelligence
We study the problem of training a principal in a multi-agent general-sum game using reinforcement learning (RL). Learning a robust principal policy requires anticipating the worst possible strategic responses of other agents, which is generally NP-hard. However, we show that no-regret dynamics can identify these worst-case responses in poly-time in smooth games. We propose a framework that uses this policy evaluation method for efficiently learning a robust principal policy using RL. This framework can be extended to provide robustness to boundedly rational agents too. Our motivating application is automated mechanism design: we empirically demonstrate our framework learns robust mechanisms in both matrix games and complex spatiotemporal games. In particular, we learn a dynamic tax policy that improves the welfare of a simulated trade-and-barter economy by 15%, even when facing previously unseen boundedly rational RL taxpayers.
arXiv.org Artificial Intelligence
Dec-19-2022
- Country:
- Europe
- Austria (0.04)
- Spain > Catalonia
- Barcelona Province > Barcelona (0.04)
- North America > United States
- California
- Alameda County > Berkeley (0.14)
- Los Angeles County
- Long Beach (0.04)
- Los Angeles (0.04)
- Santa Clara County > Palo Alto (0.04)
- Colorado > Denver County
- Denver (0.04)
- Hawaii > Honolulu County
- Honolulu (0.04)
- Illinois > Cook County
- Chicago (0.04)
- Massachusetts > Suffolk County
- Boston (0.04)
- Michigan (0.04)
- California
- Oceania > Australia
- New South Wales > Sydney (0.04)
- Europe
- Genre:
- Research Report (0.82)
- Industry:
- Government > Tax (0.67)
- Leisure & Entertainment > Games
- Computer Games (0.46)
- Technology: