Learning in Stackelberg Games with Non-myopic Agents

Nika Haghtalab, Thodoris Lykouris, Sloan Nietert, Alex Wei

arXiv.org Artificial Intelligence 

Stackelberg games are a canonical model for strategic principal-agent interactions. Consider a defense system that distributes its security resources across high-risk targets before attacks are executed; or a tax policymaker who sets rules on when audits are triggered before seeing filed tax reports; or a seller who chooses a price before knowing a customer's proclivity to buy. In each of these scenarios, a principal first selects an action x ∈ X and then an agent reacts with an action y ∈ Y, where X and Y are the principal's and agent's action spaces, respectively. In the examples above, agent actions correspond to which target to attack, how much tax to pay to evade an audit, and how much to purchase, respectively. Typically, the principal wants an x that maximizes their payoff when the agent plays a best response y = br(x); such a pair (x, y) is a Stackelberg equilibrium. By committing to a strategy, the principal can guarantee a higher payoff than in the fixed-point equilibrium of the corresponding simultaneous-play game. However, finding such a strategy requires knowledge of the agent's payoff function. When faced with unknown agent payoffs, the principal can attempt to learn a best response via repeated interactions with the agent. If a (naïve) agent is unaware that such learning occurs and always plays a best response, the principal can use classical online learning approaches to optimize their own payoff in the stage game.
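To make the definition of y = br(x) and the value of commitment concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a finite game in which the principal commits to a pure action and the agent's payoffs are known; the payoff matrices PRINCIPAL and AGENT and the function names are hypothetical and chosen only for illustration. Ties in the agent's best response are broken by lowest index, which is also an assumption.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def best_response(agent_payoff: Matrix, x: int) -> int:
    """Return the agent action y = br(x) maximizing the agent's payoff.

    Ties are broken by lowest index (an illustrative assumption).
    """
    row = agent_payoff[x]
    return max(range(len(row)), key=lambda y: row[y])


def stackelberg_pure(principal_payoff: Matrix,
                     agent_payoff: Matrix) -> Tuple[int, int, float]:
    """Enumerate pure commitments x and return (x, br(x), principal payoff)."""
    best = None
    for x in range(len(principal_payoff)):
        y = best_response(agent_payoff, x)
        value = principal_payoff[x][y]
        if best is None or value > best[2]:
            best = (x, y, value)
    return best


# Hypothetical 2x2 game: rows are principal actions, columns are agent actions.
PRINCIPAL = [[1, 3],
             [0, 2]]
AGENT = [[1, 0],
         [0, 1]]

result = stackelberg_pure(PRINCIPAL, AGENT)
print(result)  # (1, 1, 2)
```

In this toy game, row 0 strictly dominates row 1 for the principal, so simultaneous play leads to the action pair (0, 0) with principal payoff 1. Committing to row 1 instead induces the agent to play column 1 and yields payoff 2, which illustrates why commitment can help. The enumeration relies on the agent's payoff matrix being known, which is exactly the knowledge the principal lacks in the learning setting described above.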
