Paths to Equilibrium in Normal-Form Games
Bora Yongacoglu, Gürdal Arslan, Lacra Pavel, Serdar Yüksel
arXiv.org Artificial Intelligence
In multi-agent reinforcement learning (MARL), agents repeatedly interact across time and revise their strategies as new data arrives, producing a sequence of strategy profiles. This paper studies sequences of strategies satisfying a pairwise constraint inspired by policy updating in reinforcement learning: an agent who is best responding in period $t$ does not switch its strategy in period $t+1$. This constraint requires only that optimizing agents do not switch strategies; it does not constrain the non-optimizing agents in any way, and thus allows for exploration. Sequences with this property are called satisficing paths, and they arise naturally in many MARL algorithms. A fundamental question about strategic dynamics is the following: for a given game and initial strategy profile, is it always possible to construct a satisficing path that terminates at an equilibrium strategy? The resolution of this question has implications for the capabilities and limitations of a class of MARL algorithms. We answer this question in the affirmative for mixed extensions of finite normal-form games.
Mar-26-2024
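The satisficing-path property described in the abstract can be checked mechanically. The sketch below, which is illustrative and not from the paper, verifies the constraint for a two-player game given payoff matrices `A` (row player) and `B` (column player) and a sequence of mixed-strategy profiles: any player who is best responding at step $t$ must keep the same strategy at step $t+1$, while other players are unconstrained. All function names and the tolerance parameter are assumptions for this example.

```python
import numpy as np

def best_response_payoff(A, opp_mix):
    # Maximum expected payoff available against the opponent's mixed strategy.
    return np.max(A @ opp_mix)

def is_best_responding(A, own_mix, opp_mix, tol=1e-9):
    # True if own_mix attains (within tol) the best-response payoff.
    return own_mix @ A @ opp_mix >= best_response_payoff(A, opp_mix) - tol

def is_satisficing_path(A, B, path, tol=1e-9):
    """Check the satisficing-path property for a two-player game.

    A: row player's payoff matrix (shape m x n).
    B: column player's payoff matrix (same shape, payoffs to the column player).
    path: list of (x, y) mixed-strategy profiles.
    A best-responding player at step t must reuse its strategy at t+1;
    a non-best-responding player may switch arbitrarily (exploration).
    """
    for (x, y), (x_next, y_next) in zip(path, path[1:]):
        if is_best_responding(A, x, y, tol) and not np.allclose(x, x_next):
            return False
        # Column player's payoff x @ B @ y equals y @ B.T @ x.
        if is_best_responding(B.T, y, x, tol) and not np.allclose(y, y_next):
            return False
    return True

# Example: matching pennies. At (heads, heads) the row player is best
# responding and must stay put, while the column player is free to switch.
A = np.array([[1., -1.], [-1., 1.]])
B = -A
heads, tails = np.array([1., 0.]), np.array([0., 1.])
print(is_satisficing_path(A, B, [(heads, heads), (heads, tails)]))  # True
print(is_satisficing_path(A, B, [(heads, heads), (tails, heads)]))  # False
```

In the second path the row player abandons a best response, violating the constraint, so the sequence is not a satisficing path.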