
Replicator Dynamics


Matrix Multiplicative Weights Updates in Quantum Zero-Sum Games: Conservation Laws & Recurrence

Neural Information Processing Systems

Moreover, we show that the system exhibits Poincaré recurrence, meaning that almost all orbits return arbitrarily close to their initial conditions infinitely often.
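As a rough classical analogue (the paper itself concerns matrix multiplicative weights in quantum games), the following sketch runs ordinary multiplicative weights on matching pennies; the game, step size, and horizon are our illustrative choices. Day-to-day play cycles around the interior equilibrium, while the time-averaged strategy approaches it.

```python
import numpy as np

# Classical (non-matrix) multiplicative weights in a 2x2 zero-sum game.
# Matching pennies: player 1 maximizes x^T A y, player 2 minimizes it.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

def mwu_step(x, y, eta):
    # Each player reweights strategies exponentially in their payoffs.
    x_new = x * np.exp(eta * (A @ y))      # player 1 ascends
    y_new = y * np.exp(-eta * (A.T @ x))   # player 2 descends
    return x_new / x_new.sum(), y_new / y_new.sum()

x = np.array([0.7, 0.3])
y = np.array([0.4, 0.6])
avg_x = np.zeros(2)
T = 5000
for _ in range(T):
    x, y = mwu_step(x, y, eta=0.05)
    avg_x += x
avg_x /= T
# Day-to-day play cycles around the interior equilibrium (1/2, 1/2),
# but the time-averaged strategy converges toward it.
```

The trajectory itself never settles, which is the behavior the recurrence result formalizes; only the empirical average equilibrates.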


Aspiration-based Perturbed Learning Automata in Games with Noisy Utility Measurements. Part A: Stochastic Stability in Non-zero-Sum Games

Chasparis, Georgios C.

arXiv.org Artificial Intelligence

Reinforcement-based learning has attracted considerable attention both in modeling human behavior and in engineering, for designing measurement- or payoff-based optimization schemes. Such learning schemes exhibit several advantages, especially in relation to filtering out noisy observations. However, they may exhibit several limitations when applied in a distributed setup. In multi-player weakly-acyclic games, when each player applies an independent copy of the learning dynamics, convergence to (usually desirable) pure Nash equilibria cannot be guaranteed. Prior work has focused only on a small class of games, namely potential and coordination games. To address this main limitation, this paper introduces a novel payoff-based learning scheme for distributed optimization, namely aspiration-based perturbed learning automata (APLA). In this class of dynamics, and contrary to standard reinforcement-based learning schemes, each player's probability distribution for selecting actions is reinforced both by repeated selection and by an aspiration factor that captures the player's satisfaction level. We provide a stochastic-stability analysis of APLA in multi-player positive-utility games in the presence of noisy observations. This first part of the paper characterizes stochastic stability in generic non-zero-sum games by establishing equivalence of the induced infinite-dimensional Markov chain with a finite-dimensional one. In the second part, stochastic stability is further specialized to weakly acyclic games.
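To make the aspiration idea concrete, here is a schematic automaton update (our illustrative form, not the paper's exact APLA dynamics): probability mass moves toward the chosen action when its noisy payoff exceeds the aspiration level and away from it otherwise, with a small uniform perturbation; step sizes, the payoff model, and the aspiration filter are all assumptions.

```python
import random

def apla_like_step(p, action, payoff, aspiration, step=0.1, eps=0.01):
    """Schematic aspiration-based update (illustrative only): reinforce the
    chosen action in proportion to how far its noisy payoff exceeds the
    aspiration level, then mix with a small uniform perturbation."""
    n = len(p)
    surplus = payoff - aspiration          # satisfaction relative to aspiration
    p = [pi + step * surplus * ((1.0 if i == action else 0.0) - pi)
         for i, pi in enumerate(p)]
    p = [min(max(pi, 0.0), 1.0) for pi in p]  # keep a valid distribution
    s = sum(p)
    p = [pi / s for pi in p]
    return [(1 - eps) * pi + eps / n for pi in p]

random.seed(1)
p = [0.5, 0.5]
aspiration = 0.0
true_payoffs = [0.3, 0.8]                  # action 1 is better on average
for _ in range(2000):
    action = random.choices(range(2), weights=p)[0]
    payoff = true_payoffs[action] + random.gauss(0.0, 0.1)  # noisy measurement
    p = apla_like_step(p, action, payoff, aspiration)
    aspiration += 0.05 * (payoff - aspiration)  # slowly track average payoff
```

Despite the noise, the automaton concentrates on the higher-payoff action, while the epsilon-perturbation keeps every action played with positive probability, which is what makes the stochastic-stability analysis possible.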




Can Media Act as a Soft Regulator of Safe AI Development? A Game Theoretical Analysis

da Fonseca, Henrique Correia, Fernandes, António, Song, Zhao, Cimpeanu, Theodor, Balabanova, Nataliya, Bashir, Adeela, Bova, Paolo, Buscemi, Alessio, Di Stefano, Alessandro, Duong, Manh Hong, Domingos, Elias Fernandez, Ogbo, Ndidi Bianca, Powers, Simon T., Proverbio, Daniele, Shamszaman, Zia Ush, Santos, Fernando P., Han, The Anh, Krellner, Marcus

arXiv.org Artificial Intelligence

When developers of artificial intelligence (AI) products need to decide between profit and safety for the users, they likely choose profit. Untrustworthy AI technology must come packaged with tangible negative consequences. Here, we envisage those consequences as the loss of reputation caused by media coverage of their misdeeds, disseminated to the public. We explore whether media coverage has the potential to push AI creators into the production of safe products, enabling widespread adoption of AI technology. We created artificial populations of self-interested creators and users and studied them through the lens of evolutionary game theory. Our results reveal that media is indeed able to foster cooperation between creators and users, but not always. Cooperation does not evolve if the quality of the information provided by the media is not reliable enough, or if the costs of either accessing media or ensuring safety are too high. By shaping public perception and holding developers accountable, media emerges as a powerful soft regulator -- guiding AI safety even in the absence of formal government oversight.


Feedback Linearization for Replicator Dynamics: A Control Framework for Evolutionary Game Convergence

Faisal, Adil

arXiv.org Artificial Intelligence

This paper demonstrates the first application of feedback linearization to replicator dynamics, driving non-convergent evolutionary games to systems with guaranteed global asymptotic stability. Replicator dynamics, while a cornerstone of evolutionary game theory, possess neutral stability at Nash equilibria [2], which causes the evolutionary process to oscillate without converging to an optimal strategy. We build a control-theoretic framework that cancels the nonlinear components of the replicator dynamics and then applies linear feedback to force convergence to the Nash equilibrium. Through Lyapunov analysis, we show global convergence from any initial condition in the probability simplex. We illustrate this approach with a numerical example of a penalty shootout game, showing that our method quickly guides strategies to the mixed Nash equilibrium, while the uncontrolled dynamics oscillate. Our work provides one of the first known connections between nonlinear control theory and evolutionary game dynamics, with applications in multi-agent systems, algorithmic trading, and strategic optimization.
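One minimal way to realize this idea (our illustrative controller and gains, not necessarily the paper's construction) is to choose a control input that cancels the replicator nonlinearity and imposes linear error dynamics toward the interior equilibrium; matching pennies stands in for the penalty shootout.

```python
import numpy as np

# Matching pennies as a stand-in for the penalty shootout: the unique
# Nash equilibrium is the interior mixed strategy (1/2, 1/2) for both players.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])
x_star = np.array([0.5, 0.5])

def replicator(x, payoff):
    # Standard replicator field: xdot_i = x_i * (payoff_i - x . payoff)
    return x * (payoff - x @ payoff)

def step(x, y, dt, k=None):
    fx = replicator(x, A @ y)
    fy = replicator(y, -A.T @ x)
    if k is not None:
        # Feedback linearization (illustrative): the control cancels the
        # replicator nonlinearity and imposes xdot = -k (x - x*), which
        # stays on the tangent space of the simplex.
        fx = -k * (x - x_star)
        fy = -k * (y - x_star)
    return x + dt * fx, y + dt * fy

x, y = np.array([0.8, 0.2]), np.array([0.3, 0.7])
xc, yc = x.copy(), y.copy()
for _ in range(4000):
    x, y = step(x, y, dt=0.01)              # uncontrolled: oscillates
    xc, yc = step(xc, yc, dt=0.01, k=2.0)   # controlled: converges
```

The uncontrolled pair keeps circling the equilibrium (a conserved quantity bounds it away from the center), while the controlled pair decays exponentially to the mixed Nash equilibrium.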


Online Learning in Periodic Zero-Sum Games: Supplementary Material (Appendix Organization and Contents)

Neural Information Processing Systems

The organization and contents of this appendix are as follows. Following Appendix A, proofs of the theoretical results in the paper are presented in the order in which they appeared. Specifically, Appendix B contains the proofs of the results presented in Section 4.1, namely Proposition 1, Lemma 1, Lemma 2, and Theorem 1. In Appendix C, we provide the proof of Proposition 2 from Section 4.2 on time-average behavior. The argument in Appendix B.1 (proof of Proposition 1) allows us to conclude that the system is not Poincaré recurrent by definition; given these intermediate results, Theorem 1 follows from the arguments presented in Section 3.3. Appendix D includes the proofs of Lemma 3, Lemma 4, and Theorem 2. In the proof of Lemma 3 (Appendix D.1), we show that the divergence of the relevant vector field is zero, from which we conclude by Liouville's theorem that the dynamics are volume preserving.
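The zero-divergence step behind the Liouville argument can be checked numerically. For exponential-weights FTRL in a zero-sum game written in cumulative-payoff ("dual") coordinates, each player's drift depends only on the opponent's coordinates, so every diagonal Jacobian entry, and hence the divergence, vanishes; the game and coordinates below are our illustrative choices.

```python
import numpy as np

A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])   # matching pennies

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def field(z):
    """FTRL dynamics in cumulative-payoff coordinates z = (z1, z2):
    zdot1 = A @ softmax(z2), zdot2 = -A.T @ softmax(z1). Each block depends
    only on the other player's coordinates, so div F = 0 identically."""
    z1, z2 = z[:2], z[2:]
    return np.concatenate([A @ softmax(z2), -A.T @ softmax(z1)])

def divergence(f, z, h=1e-5):
    # Sum of diagonal Jacobian entries via central finite differences.
    d = 0.0
    for i in range(len(z)):
        e = np.zeros_like(z)
        e[i] = h
        d += (f(z + e)[i] - f(z - e)[i]) / (2 * h)
    return d

z = np.array([0.3, -0.7, 1.2, 0.4])
div = divergence(field, z)
# div == 0 up to floating point: the flow preserves volume (Liouville).
```

Zero divergence everywhere means the flow cannot contract phase-space volume onto an attractor, which is the structural fact the volume-preservation lemma exploits.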


The impact of uncertainty on regularized learning in games

Cauvin, Pierre-Louis, Legacci, Davide, Mertikopoulos, Panayotis

arXiv.org Artificial Intelligence

In this paper, we investigate how randomness and uncertainty influence learning in games. Specifically, we examine a perturbed variant of the dynamics of "follow-the-regularized-leader" (FTRL), where the players' payoff observations and strategy updates are continually impacted by random shocks. Our findings reveal that, in a fairly precise sense, "uncertainty favors extremes": in any game, regardless of the noise level, every player's trajectory of play reaches an arbitrarily small neighborhood of a pure strategy in finite time (which we estimate). Moreover, even if the player does not ultimately settle at this strategy, they return arbitrarily close to some (possibly different) pure strategy infinitely often. This prompts the question of which sets of pure strategies emerge as robust predictions of learning under uncertainty. We show that (a) the only possible limits of the FTRL dynamics under uncertainty are pure Nash equilibria; and (b) a span of pure strategies is stable and attracting if and only if it is closed under better replies. Finally, we turn to games where the deterministic dynamics are recurrent - such as zero-sum games with interior equilibria - and we show that randomness disrupts this behavior, causing the stochastic dynamics to drift toward the boundary on average.
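The "uncertainty favors extremes" phenomenon is easy to reproduce in simulation with a perturbed exponential-weights instance of FTRL; the game, noise level, step size, and horizon below are our illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])   # matching pennies: interior equilibrium (1/2, 1/2)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

z1 = np.zeros(2)   # cumulative (noisy) payoffs, player 1
z2 = np.zeros(2)
eta, sigma = 0.1, 1.0
closest_to_pure = 0.0
for _ in range(5000):
    x, y = softmax(z1), softmax(z2)
    # payoff observations corrupted by i.i.d. Gaussian shocks
    z1 += eta * (A @ y + sigma * rng.standard_normal(2))
    z2 += eta * (-A.T @ x + sigma * rng.standard_normal(2))
    closest_to_pure = max(closest_to_pure, float(x.max()), float(y.max()))
# Despite the interior equilibrium, the noisy trajectory repeatedly
# comes close to pure strategies (faces of the simplex).
```

The cumulative noise acts as a random walk on the payoff differences, so strategy trajectories are repeatedly pushed into small neighborhoods of pure strategies, matching the drift-to-the-boundary behavior described in the abstract.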


Higher-Order Uncoupled Learning Dynamics and Nash Equilibrium

Toonsi, Sarah A., Shamma, Jeff S.

arXiv.org Artificial Intelligence

We study learnability of mixed-strategy Nash Equilibrium (NE) in general finite games using higher-order replicator dynamics as well as classes of higher-order uncoupled heterogeneous dynamics. In higher-order uncoupled learning dynamics, players have no access to utilities of opponents (uncoupled) but are allowed to use auxiliary states to further process information (higher-order). We establish a link between uncoupled learning and feedback stabilization with decentralized control. Using this association, we show that for any finite game with an isolated completely mixed-strategy NE, there exist higher-order uncoupled learning dynamics that lead (locally) to that NE. We further establish the lack of universality of learning dynamics by linking learning to the control theoretic concept of simultaneous stabilization. We construct two games such that any higher-order dynamics that learn the completely mixed-strategy NE of one of these games can never learn the completely mixed-strategy NE of the other. Next, motivated by imposing natural restrictions on allowable learning dynamics, we introduce the Asymptotic Best Response (ABR) property. Dynamics with the ABR property asymptotically learn a best response in environments that are asymptotically stationary. We show that the ABR property relates to an internal stability condition on higher-order learning dynamics. We provide conditions under which NE are compatible with the ABR property. Finally, we address learnability of mixed-strategy NE in the bandit setting using a bandit version of higher-order replicator dynamics.
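To illustrate what "higher-order" means here, the sketch below augments the replicator dynamic with an auxiliary state per player that low-pass filters the opponent's strategy and adds a derivative-style anticipatory correction. This is one simple schematic instance of higher-order uncoupled dynamics (gains, filter, and game are our assumptions, not the paper's specific constructions); in matching pennies it damps the oscillation that plain replicator dynamics exhibit around the mixed NE.

```python
import numpy as np

A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])   # matching pennies: mixed NE at (1/2, 1/2)

def replicator(x, p):
    return x * (p - x @ p)

# Higher-order learning (schematic): each player keeps an auxiliary state q
# that low-pass filters the opponent's strategy and feeds a derivative-type
# anticipatory term into the payoff signal. The dynamics remain uncoupled:
# each player uses only its own payoff information.
dt, lam, tau = 0.01, 1.0, 0.5
x, y = np.array([0.8, 0.2]), np.array([0.3, 0.7])
qx, qy = y.copy(), x.copy()     # filtered estimates of the opponent
for _ in range(20000):
    # anticipatory payoff: opponent strategy plus lam * its filtered derivative
    px = A @ (y + lam * (y - qx) / tau)
    py = -A.T @ (x + lam * (x - qy) / tau)
    x = x + dt * replicator(x, px)
    y = y + dt * replicator(y, py)
    qx = qx + dt * (y - qx) / tau
    qy = qy + dt * (x - qy) / tau
```

A linearization around the mixed NE shows the auxiliary (filter) states shift the purely imaginary eigenvalues of the first-order replicator into the open left half-plane, which is the feedback-stabilization viewpoint the abstract describes.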


Sink equilibria and the attractors of learning in games

Biggar, Oliver, Papadimitriou, Christos

arXiv.org Artificial Intelligence

Characterizing the limit behavior -- that is, the attractors -- of learning dynamics is one of the most fundamental open questions in game theory. In recent work on this front, it was conjectured that the attractors of the replicator dynamic are in one-to-one correspondence with the sink equilibria of the game -- the sink strongly connected components of a game's preference graph -- and it was established that they stand in at least one-to-many correspondence with them. We make threefold progress on the problem of characterizing attractors. First, we show through a topological construction that the one-to-one conjecture is false. Second, we make progress on the attractor-characterization problem for two-player games by establishing that the one-to-one conjecture is true in the absence of a local pattern called a weak local source -- a pattern that is absent from zero-sum games. Finally, we look -- for the first time in this context -- at fictitious play, the longest-studied learning dynamic, and examine to what extent the conjecture generalizes there. We establish that under fictitious play, sink equilibria always contain attractors (sometimes strictly), and every attractor corresponds to a strongly connected set of nodes in the preference graph.
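Sink equilibria are purely combinatorial objects, so they are easy to compute: build the preference graph (an arc for each single-player strict better reply) and take the strongly connected components with no outgoing arcs. A minimal sketch for two-player games, with the example games as our illustrative choices:

```python
from itertools import product

def preference_graph(payoffs):
    """payoffs maps a pure profile (a1, a2) to the pair (u1, u2).
    Nodes are pure profiles; draw an arc a -> b when b differs from a in
    exactly one player's action and that player strictly prefers b."""
    nodes = list(payoffs)
    arcs = {a: [] for a in nodes}
    for a, b in product(nodes, repeat=2):
        diff = [i for i in range(2) if a[i] != b[i]]
        if len(diff) == 1 and payoffs[b][diff[0]] > payoffs[a][diff[0]]:
            arcs[a].append(b)
    return arcs

def sink_sccs(arcs):
    """Kosaraju's algorithm, keeping the strongly connected components
    with no outgoing arc: these are the sink equilibria."""
    order, seen = [], set()
    def dfs(u):
        seen.add(u)
        for v in arcs[u]:
            if v not in seen:
                dfs(v)
        order.append(u)
    for u in arcs:
        if u not in seen:
            dfs(u)
    rev = {u: [] for u in arcs}
    for u in arcs:
        for v in arcs[u]:
            rev[v].append(u)
    comp, ncomp = {}, 0
    for u in reversed(order):
        if u in comp:
            continue
        stack = [u]
        while stack:
            w = stack.pop()
            if w not in comp:
                comp[w] = ncomp
                stack.extend(v for v in rev[w] if v not in comp)
        ncomp += 1
    groups = [set() for _ in range(ncomp)]
    for u, c in comp.items():
        groups[c].add(u)
    return [g for g in groups
            if all(v in g for u in g for v in arcs[u])]

# Matching pennies: the preference graph is a 4-cycle, so the whole game
# is a single sink equilibrium (there is no pure Nash equilibrium).
mp = {('H', 'H'): (1, -1), ('H', 'T'): (-1, 1),
      ('T', 'H'): (-1, 1), ('T', 'T'): (1, -1)}
# Prisoner's dilemma: better replies all lead to (D, D), the unique sink.
pd = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
      ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}
```

A sink equilibrium that is a single node is a pure Nash equilibrium; the matching-pennies example shows how a sink can instead be a whole better-reply cycle, which is where the correspondence with attractors becomes subtle.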