Inverse Game Theory for Stackelberg Games: the Blessing of Bounded Rationality

Neural Information Processing Systems

One primary objective of game theory is to predict the behaviors of agents through equilibrium concepts in a given game. In practice, however, we may observe some equilibrium behaviors of agents, but the game itself turns out to be unknown.
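The "blessing" in the title can be illustrated with a toy inverse problem: an exactly rational agent's observed equilibrium play only reveals payoff *inequalities*, whereas a boundedly rational agent's smoothed responses reveal payoff *magnitudes*. A minimal sketch (not the paper's construction; the logit quantal-response model, the payoffs, and the rationality parameter `lam` are all invented for illustration):

```python
import math

# Assumed forward model: under logit quantal response with rationality
# parameter lam, the agent plays action a with probability proportional
# to exp(lam * u[a]).
lam = 2.0
true_u = {"a": 1.0, "b": 0.3}   # hypothetical payoffs, unknown to the observer

# Forward pass: the choice probabilities an observer would see.
z = {a: math.exp(lam * u) for a, u in true_u.items()}
total = sum(z.values())
p = {a: z[a] / total for a in z}

# Inverse step: the observed log-odds pin down the payoff gap exactly,
# because u["a"] - u["b"] = log(p["a"] / p["b"]) / lam.
recovered_gap = math.log(p["a"] / p["b"]) / lam
```

Under exact best responses, the same observation ("the agent always plays a") would only tell us `u["a"] >= u["b"]`.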


Stackelberg POMDP: A Reinforcement Learning Approach for Economic Design

Brero, Gianluca, Eden, Alon, Chakrabarti, Darshan, Gerstgrasser, Matthias, Greenwald, Amy, Li, Vincent, Parkes, David C.

arXiv.org Artificial Intelligence

We introduce a reinforcement learning framework for economic design in which the interaction between the environment designer and the participants is modeled as a Stackelberg game. In this game, the designer (leader) sets the rules of the economic system, while the participants (followers) respond strategically. We integrate algorithms for determining followers' response strategies into the leader's learning environment, formulating the leader's learning problem as a POMDP that we call the Stackelberg POMDP. We prove that the leader's optimal strategy in the Stackelberg game is the optimal policy in our Stackelberg POMDP under a limited set of possible policies, establishing a connection between solving POMDPs and solving Stackelberg games. We solve our POMDP under this limited set of policy options via the centralized-training, decentralized-execution framework. For the specific case of followers modeled as no-regret learners, we solve an array of increasingly complex settings, including problems of indirect mechanism design with turn-taking and limited communication among agents. We demonstrate the effectiveness of our training framework through ablation studies. We also give convergence results for no-regret learners to a Bayesian version of coarse correlated equilibrium, extending known results to the case of correlated types.
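For intuition about the leader's commitment problem underlying this and the other Stackelberg papers listed here, a minimal sketch (unrelated to the paper's POMDP machinery): a toy 2x2 game where the leader grid-searches over mixed commitments and the follower best-responds, with ties broken in the leader's favor as in a strong Stackelberg equilibrium. Both payoff matrices are invented for illustration:

```python
import numpy as np

# Hypothetical payoffs: rows = leader actions, columns = follower actions.
L = np.array([[2.0, 4.0],
              [1.0, 3.0]])   # leader payoffs
F = np.array([[1.0, 0.0],
              [0.0, 2.0]])   # follower payoffs

def leader_value(p):
    """Leader's payoff when committing to mixed strategy (p, 1-p),
    assuming the follower best-responds (ties favor the leader)."""
    mix = np.array([p, 1.0 - p])
    follower_payoffs = mix @ F   # follower's expected payoff per action
    best = np.flatnonzero(follower_payoffs == follower_payoffs.max())
    return max((mix @ L)[j] for j in best)

# Grid search over the leader's commitment probability.
grid = np.linspace(0.0, 1.0, 1001)
values = [leader_value(p) for p in grid]
p_star = grid[int(np.argmax(values))]
```

In this toy game the leader's best commitment puts weight just up to 2/3 on the first action, keeping the follower indifferent; exact methods replace the grid search with one LP per follower action.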


Non-myopic learning in repeated stochastic games

Crandall, Jacob W.

arXiv.org Artificial Intelligence

In repeated stochastic games (RSGs), an agent must quickly adapt to the behavior of previously unknown associates, who may themselves be learning. This machine-learning problem is particularly challenging due, in part, to the presence of multiple (even infinite) equilibria and inherently large strategy spaces. In this paper, we introduce a method to reduce the strategy space of two-player general-sum RSGs to a handful of expert strategies. This process, called Mega, effectually reduces an RSG to a bandit problem. We show that the resulting strategy space preserves several important properties of the original RSG, thus enabling a learner to produce robust strategies within a reasonably small number of interactions. To better establish strengths and weaknesses of this approach, we empirically evaluate the resulting learning system against other algorithms in three different RSGs.
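Once an RSG is reduced to a handful of expert strategies, play becomes a bandit problem over those experts. A generic sketch of that final stage only (this is plain epsilon-greedy over invented Bernoulli payoffs, not the paper's Mega algorithm, and the simulated associate is a stand-in for real interaction):

```python
import random

# Hypothetical expert strategies distilled from an RSG; playing expert i
# for one interaction yields payoff 1 with an unknown probability.
EXPERT_PAYOFF_PROB = [0.3, 0.7, 0.5]   # assumed, hidden from the learner

def play(expert):
    """Simulate one round against the associate using expert `expert`."""
    return 1.0 if random.random() < EXPERT_PAYOFF_PROB[expert] else 0.0

def epsilon_greedy(rounds=5000, eps=0.1, seed=0):
    """Treat the reduced strategy set as bandit arms."""
    random.seed(seed)
    n = len(EXPERT_PAYOFF_PROB)
    counts = [0] * n
    means = [0.0] * n
    for _ in range(rounds):
        if random.random() < eps:
            a = random.randrange(n)                       # explore
        else:
            a = max(range(n), key=lambda i: means[i])     # exploit
        r = play(a)
        counts[a] += 1
        means[a] += (r - means[a]) / counts[a]            # running mean
    return counts, means

counts, means = epsilon_greedy()
```

The learner concentrates play on the best expert within a modest number of interactions, which is the payoff of shrinking the strategy space first.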


Robust Learning for Repeated Stochastic Games via Meta-Gaming

Crandall, Jacob W. (Masdar Institute of Science and Technology)

AAAI Conferences

In repeated stochastic games (RSGs), an agent must quickly adapt to the behavior of previously unknown associates, who may themselves be learning. This machine-learning problem is particularly challenging due, in part, to the presence of multiple (even infinite) equilibria and inherently large strategy spaces. In this paper, we introduce a method to reduce the strategy space of two-player general-sum RSGs to a handful of expert strategies. This process, called mega, effectually reduces an RSG to a bandit problem. We show that the resulting strategy space preserves several important properties of the original RSG, thus enabling a learner to produce robust strategies within a reasonably small number of interactions. To better establish strengths and weaknesses of this approach, we empirically evaluate the resulting learning system against other algorithms in three different RSGs.


Towards Optimal Patrol Strategies for Fare Inspection in Transit Systems

Jiang, Albert Xin (University of Southern California) | Yin, Zhengyu (University of Southern California) | Johnson, Matthew P. (University of Southern California) | Tambe, Milind (University of Southern California) | Kiekintveld, Christopher (University of Texas at El Paso) | Leyton-Brown, Kevin (University of British Columbia) | Sandholm, Tuomas (Carnegie Mellon University)

AAAI Conferences

In some urban transit systems, passengers are legally required to purchase tickets before entering but are not physically forced to do so. Instead, patrol units move through the transit system inspecting passengers' tickets, and passengers face fines for fare evasion. This setting yields the problem of computing optimal patrol strategies satisfying certain temporal and spatial constraints, to deter fare evasion and hence maximize revenue. In this paper we propose an initial model of this problem as a leader-follower Stackelberg game. We then formulate an LP relaxation of this problem and present initial experimental results using real-world ridership data from the Los Angeles Metro Rail system.
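To illustrate the leader-follower structure in this setting (a toy deterrence model, not the paper's actual LP formulation), suppose a rider at station `s` pays the fare only when `fine * q[s] >= fare`, and total inspection probability is bounded by a patrol budget. All station names, fares, and numbers below are invented:

```python
# Assumed toy model: the leader (transit authority) commits to inspection
# probabilities q; each follower (rider) best-responds by paying or evading.
FARE, FINE, BUDGET = 1.5, 10.0, 0.4          # illustrative parameters
RIDERSHIP = {"A": 900, "B": 400, "C": 700}   # hypothetical stations

def optimal_patrol(fare, fine, budget, ridership):
    """Greedy solution: the deterrence threshold q* = fare / fine is the
    same at every station, so cover the busiest stations first."""
    q_star = fare / fine
    q = {s: 0.0 for s in ridership}
    remaining = budget
    for s in sorted(ridership, key=ridership.get, reverse=True):
        if remaining >= q_star:
            q[s] = q_star        # a rider at s now prefers paying the fare
            remaining -= q_star
    revenue = sum(ridership[s] * fare
                  for s in ridership if q[s] >= q_star)
    return q, revenue

q, revenue = optimal_patrol(FARE, FINE, BUDGET, RIDERSHIP)
```

The real problem adds temporal and spatial feasibility constraints on patrol routes, which is what motivates the LP relaxation in the paper.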