Time-Varyingness in Auction Breaks Revenue Equivalence
Fujimoto, Yuma, Ariu, Kaito, Abe, Kenshi
Auctions are among the most representative buying-and-selling systems. A celebrated result shows that the seller's expected revenue in equilibrium is the same regardless of the auction format, typically first-price and second-price auctions. Here, however, we hypothesize that when some features of the auction environment vary with time, this revenue equivalence may no longer hold. In second-price auctions, the equilibrium strategy remains robustly feasible. In first-price auctions, by contrast, the buyers must continually adapt their strategies to the changing environment. Surprisingly, we prove that revenue equivalence can be broken in both directions: first-price auctions yield larger or smaller revenue than second-price auctions depending on how the value of the item varies. Our experiments also demonstrate revenue inequivalence in various scenarios where the value varies periodically or randomly. This study uncovers a phenomenon, the breaking of revenue equivalence by time-varyingness in auctions, that likely occurs in real-world auctions, and reveals its underlying mechanism.
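As a rough illustration of the intuition (a toy Monte Carlo sketch under assumed dynamics, not the paper's model): suppose two buyers have uniform private values multiplied by a time-varying scale, second-price bidders bid truthfully (robust to the variation), and first-price bidders use the static equilibrium shading b = v/2 but calibrate it to a lagged estimate of the scale. When the value level has just risen, first-price bids lag below the static benchmark and revenue falls short of second price; when it has just fallen, the reverse holds.

    import numpy as np

    rng = np.random.default_rng(0)
    T, period = 200_000, 1_000
    # assumed square-wave value scale alternating between low and high regimes
    scale = np.where((np.arange(T) // period) % 2 == 0, 1.0, 2.0)

    est = 1.0   # first-price bidders' lagged estimate of the current scale
    rev = {"first_hi": [], "second_hi": [], "first_lo": [], "second_lo": []}
    for t in range(T):
        u = rng.uniform(size=2)                    # normalized private values
        key = "hi" if scale[t] == 2.0 else "lo"
        rev["second_" + key].append(scale[t] * u.min())  # truthful: 2nd-highest value
        rev["first_" + key].append(est * u.max() / 2)    # shaded bid at estimated scale
        est += 0.01 * (scale[t] - est)             # bidders adapt only gradually
    for k in rev:
        print(k, round(float(np.mean(rev[k])), 4))

Under this assumed adaptation rule, the printed averages show first-price revenue below second-price in the high-value regime and above it in the low-value regime, i.e., revenue equivalence broken in both directions depending on how the value varies.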
Global Behavior of Learning Dynamics in Zero-Sum Games with Memory Asymmetry
Fujimoto, Yuma, Ariu, Kaito, Abe, Kenshi
This study examines the global behavior of learning dynamics in games between two players, X and Y. We consider the simplest situation with memory asymmetry between the two players: X memorizes the opponent Y's previous action and uses reactive strategies, while Y has no memory. Although this memory complicates the learning dynamics, we discover two novel quantities that characterize the global behavior of such complex dynamics. One is an extended Kullback-Leibler divergence from the Nash equilibrium, a conserved quantity well known from previous studies. The other is a family of Lyapunov functions of X's reactive strategy. These two quantities capture the global behavior in which X's strategy becomes more exploitative while the exploited Y's strategy converges to the Nash equilibrium. Indeed, we theoretically prove that Y's strategy globally converges to the Nash equilibrium in the simplest game that has an equilibrium in the interior of the strategy spaces. Furthermore, our experiments suggest that this global convergence extends to zero-sum games more advanced than the simplest one. This study provides a novel characterization of the global behavior of learning in games through a pair of indicators.
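A minimal sketch of this memory-asymmetric setting (an illustrative toy, not the paper's exact formulation): in matching pennies, let X condition its probability of Heads on Y's previous action while Y plays a fixed mixed strategy each round. X's payoff is +1 on a match and -1 otherwise, giving u_X = (2p - 1)(2y - 1) with p = y*xH + (1 - y)*xT, and u_Y = -u_X. Simultaneous gradient ascent then drives Y toward the equilibrium:

    # xH = P(X plays H | Y played H), xT = P(X plays H | Y played T), y = P(Y plays H)
    xH, xT, y = 0.9, 0.2, 0.8
    eta = 0.01
    for _ in range(50_000):                 # simultaneous gradient ascent on own payoffs
        p = y * xH + (1 - y) * xT
        gxH = 2 * y * (2 * y - 1)                              # du_X/dxH
        gxT = 2 * (1 - y) * (2 * y - 1)                        # du_X/dxT
        gy = -2 * ((xH - xT) * (2 * y - 1) + (2 * p - 1))      # du_Y/dy
        xH = min(max(xH + eta * gxH, 0.0), 1.0)
        xT = min(max(xT + eta * gxT, 0.0), 1.0)
        y = min(max(y + eta * gy, 0.0), 1.0)
    print(round(xH, 3), round(xT, 3), round(y, 3))
    # X grows more exploitative (xH - xT widens) while y approaches the Nash value 1/2

In this toy, the gap xH - xT is nondecreasing (its time derivative is 2*(2y - 1)^2 >= 0), and one can check that 2*(xH + xT - 1)^2 + (2y - 1)^2 is a Lyapunov-like quantity that decays whenever X is exploitative, mirroring the paper's two indicators.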
Nash Equilibrium and Learning Dynamics in Three-Player Matching $m$-Action Games
Fujimoto, Yuma, Ariu, Kaito, Abe, Kenshi
Learning in games studies the processes by which multiple players learn their optimal strategies through repeated play. The dynamics of learning between two players in zero-sum games with competing payoffs, such as matching pennies, have already been well analyzed. However, analyzing the dynamics of learning among three players remains unexplored and challenging. In this study, we formulate a minimalistic game in which three players compete to match their actions with one another. Although interaction among three players diversifies and complicates the Nash equilibria, we fully characterize them. We also discuss the dynamics of learning under well-known algorithms in the Follow the Regularized Leader family. From both theoretical and experimental perspectives, we characterize the dynamics by decomposing three-player interactions into three forces: to synchronize actions, to switch actions rotationally, and to seek competition.
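For concreteness, here is a sketch of one Follow the Regularized Leader instance, multiplicative weights (FTRL with entropic regularization), in a toy three-player matching game. The abstract does not pin down the payoffs, so this assumes a cyclic zero-sum variant in which player i gains the probability of matching player i+1's action and loses that of being matched by player i-1:

    import numpy as np

    m, eta, T = 3, 0.1, 500
    rng = np.random.default_rng(1)
    x = rng.dirichlet(np.ones(m), size=3)    # mixed strategies of players 0, 1, 2
    scores = np.zeros((3, m))                # cumulative payoff gradients (FTRL state)
    for _ in range(T):
        scores += np.array([x[(i + 1) % 3] - x[(i - 1) % 3] for i in range(3)])
        z = eta * (scores - scores.max(axis=1, keepdims=True))   # stabilized logits
        x = np.exp(z)
        x /= x.sum(axis=1, keepdims=True)    # entropic FTRL = multiplicative weights
    print(np.round(x, 3))
    # under this assumed cyclic payoff, strategies keep rotating among the m
    # actions rather than settling at the uniform equilibrium

The rotational drift visible here corresponds to the "switch actions rotationally" force; which force dominates in the paper depends on the actual matching payoffs.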
Memory Asymmetry Creates Heteroclinic Orbits to Nash Equilibrium in Learning in Zero-Sum Games
Fujimoto, Yuma, Ariu, Kaito, Abe, Kenshi
Learning in games considers how multiple agents maximize their own rewards through repeated play. Memory, the ability of an agent to change its action depending on the history of actions in previous games, is often introduced into learning to explore cleverer strategies and to discuss the decision-making of real agents such as humans. However, such games with memory are hard to analyze because they exhibit complex phenomena like chaotic dynamics and divergence from Nash equilibrium. In particular, how asymmetry in memory capacities between agents affects learning in games is still unclear. In response, this study formulates a gradient ascent algorithm in games with asymmetric memory capacities. To obtain theoretical insights into the learning dynamics, we first consider a simple case of zero-sum games. We observe complex behavior in which the learning dynamics draw heteroclinic connections from unstable fixed points to stable ones. Despite this complexity, we analyze the learning dynamics and prove local convergence to these stable fixed points, i.e., the Nash equilibria. We identify the mechanism driving this convergence: the agent with the longer memory learns to exploit the other, which in turn endows the other's utility function with strict concavity. We further numerically observe such convergence for various initial strategies, numbers of actions, and memory lengths. This study reveals a novel phenomenon due to memory asymmetry, providing fundamental strides in learning in games and new insights into computing equilibria.
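The exploitation-implies-concavity mechanism can be checked in a toy two-action instance (a sketch of the mechanism, not the paper's general memory-k formulation): in matching pennies, let X use a reactive strategy (xH, xT) = P(Heads | Y's previous action was H or T) and let Y be memoryless with P(Heads) = y. Then Y's expected payoff is u_Y(y) = -(2p - 1)(2y - 1) with p = y*xH + (1 - y)*xT, whose second derivative is -8*(xH - xT):

    import numpy as np

    def u_Y(y, xH, xT):
        p = y * xH + (1 - y) * xT        # X's marginal probability of Heads
        return -(2 * p - 1) * (2 * y - 1)

    ys = np.linspace(0, 1, 101)
    for xH, xT in [(0.5, 0.5), (0.8, 0.2)]:   # indifferent vs. exploitative X
        curv = np.gradient(np.gradient(u_Y(ys, xH, xT), ys), ys)
        print((xH, xT), "max u_Y'' =", round(float(curv.max()), 3))
    # for the indifferent X, u_Y is flat in y; once X exploits (xH > xT),
    # u_Y becomes strictly concave with its maximum at the equilibrium y = 1/2

In other words, X's learned exploitation reshapes Y's payoff landscape from flat to strictly concave, which is exactly what pulls Y's strategy toward the Nash equilibrium.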
Learning in Multi-Memory Games Triggers Complex Dynamics Diverging from Nash Equilibrium
Fujimoto, Yuma, Ariu, Kaito, Abe, Kenshi
Repeated games consider a situation where multiple agents are motivated by their independent rewards throughout learning. In general, the dynamics of their learning become complex. Especially when their rewards compete with each other, as in zero-sum games, the dynamics often fail to converge to their optimum, i.e., the Nash equilibrium. To tackle such complexity, many studies have interpreted various learning algorithms as dynamical systems and derived qualitative insights into them. However, such studies have yet to handle multi-memory games (where agents can memorize actions they played in the past and choose their actions based on these memories), even though memorization plays a pivotal role in artificial intelligence and interpersonal relationships. This study extends two major learning algorithms in games, replicator dynamics and gradient ascent, to multi-memory games, and proves that their dynamics are identical. Furthermore, we clarify, both theoretically and experimentally, that the learning dynamics diverge from the Nash equilibrium in multi-memory zero-sum games and approach heteroclinic cycles (where the dynamics sojourn ever longer near the boundary of the strategy space), providing a fundamental advance in learning in games.
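A finite-difference sketch of gradient ascent in memory-one matching pennies (a numerical stand-in for the paper's closed-form multi-memory dynamics): strategies p and q give each player's probability of Heads conditioned on the previous joint action (HH, HT, TH, TT), and payoffs are averaged over the stationary distribution of the induced Markov chain.

    import numpy as np

    R = np.array([1.0, -1.0, -1.0, 1.0])     # X's payoff per state; zero-sum

    def u_X(p, q):
        M = np.zeros((4, 4))                 # joint-action Markov chain
        for s in range(4):
            for a in (0, 1):                 # X's next action (0 = Heads)
                for b in (0, 1):             # Y's next action
                    pa = p[s] if a == 0 else 1 - p[s]
                    pb = q[s] if b == 0 else 1 - q[s]
                    M[s, 2 * a + b] = pa * pb
        w, v = np.linalg.eig(M.T)
        pi = np.real(v[:, np.argmax(np.real(w))])
        return (pi / pi.sum()) @ R           # stationary-average payoff

    rng = np.random.default_rng(2)
    p, q = rng.uniform(0.4, 0.6, 4), rng.uniform(0.4, 0.6, 4)
    eta, eps = 0.05, 1e-5
    for _ in range(2_000):
        gp = np.array([(u_X(p + eps * e, q) - u_X(p, q)) / eps for e in np.eye(4)])
        gq = np.array([(u_X(p, q + eps * e) - u_X(p, q)) / eps for e in np.eye(4)])
        p = np.clip(p + eta * gp, 1e-6, 1 - 1e-6)
        q = np.clip(q - eta * gq, 1e-6, 1 - 1e-6)   # Y descends X's payoff (zero-sum)
    print(np.round(p, 3), np.round(q, 3))
    # started near the equilibrium (all probabilities 1/2), the strategies should
    # drift toward the boundary of the strategy space, echoing the divergence above

This is only an Euler discretization with numerical gradients; the paper analyzes the exact continuous-time replicator/gradient dynamics.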
Evolutionary stability of cooperation in indirect reciprocity under noisy and private assessment
Fujimoto, Yuma, Ohtsuki, Hisashi
Indirect reciprocity is a mechanism that explains large-scale cooperation in humans. In indirect reciprocity, individuals use reputations to decide whether or not to cooperate with a partner and to update others' reputations. A major question is how the rules for choosing actions and the rules for updating reputations evolve. In the public-reputation case, where all individuals share their evaluations of others, social norms called Simple Standing (SS) and Stern Judging (SJ) are known to maintain cooperation. However, in the private-assessment case, where individuals evaluate others independently, the mechanism that maintains cooperation has remained largely unknown. This study theoretically shows, for the first time, that cooperation by indirect reciprocity can be evolutionarily stable under private assessment. Specifically, we find that SS can be stable, whereas SJ can never be. This is intuitive: thanks to its simplicity, SS can correct interpersonal discrepancies in reputations, whereas SJ is too complicated to avoid an accumulation of errors, which leads to the collapse of cooperation. We conclude that moderate simplicity is key to maintaining cooperation under private assessment. Our result provides a theoretical basis for the evolution of human cooperation.
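An agent-based sketch of private assessment with noisy evaluation in a donation game (an assumed minimal setup for illustration, not the paper's analytical model): each individual i keeps a private image img[i, j] of every other individual j, donors cooperate iff the recipient looks Good to them, and every observer then privately reassesses the donor under a social norm, with independent error probability err.

    import numpy as np

    def run(norm, N=100, rounds=100_000, err=0.05, seed=0):
        rng = np.random.default_rng(seed)
        img = np.ones((N, N), dtype=int)     # img[i, j] = 1: i sees j as Good
        coop = 0
        for _ in range(rounds):
            d, r = rng.choice(N, size=2, replace=False)   # donor, recipient
            act = img[d, r]                  # cooperate iff recipient looks Good
            coop += act
            good = norm(act, img[:, r])      # each observer's private verdict
            flip = rng.random(N) < err       # independent assessment errors
            img[:, d] = np.where(flip, 1 - good, good)
        return coop / rounds

    SS = lambda act, rep: np.where((act == 1) | (rep == 0), 1, 0)  # Simple Standing
    SJ = lambda act, rep: np.where(rep == act, 1, 0)               # Stern Judging
    print("SS:", run(SS), "SJ:", run(SJ))
    # under this toy dynamic, SS is expected to sustain far more cooperation than
    # SJ, whose private images decorrelate as errors accumulate

Here SS forgives justified defection (defecting against a Bad recipient stays Good), which lets discrepant private images realign, while SJ also punishes cooperation with Bad recipients, so a single error propagates through observers' diverging views.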