Nash Convergence of Mean-Based Learning Algorithms in First Price Auctions

Xiaotie Deng, Xinyan Hu, Tao Lin, Weiqiang Zheng

arXiv.org Artificial Intelligence 

A fundamental question in the field of Learning and Games is Nash convergence of online learning dynamics: if the players in a repeated game employ online learning algorithms to adjust their strategies, will their strategies converge to a Nash equilibrium of the game? Although the answer to this question is "no" in general (see Related Works for details), positive results do exist for some special cases of online learning algorithms and games: for example, no-regret learning algorithms provably converge to Nash equilibria in zero-sum games, 2×2 games, and routing games (see e.g., Fudenberg and Levine, 1998; Cesa-Bianchi and Lugosi, 2006; Nisan et al., 2007). In this work, we analyze Nash convergence of online learning dynamics in repeated auctions, where bidders learn to bid using online learning algorithms. Although auctions are of both theoretical and practical importance, little is known about their Nash convergence properties, even for perhaps the simplest and most popular auction, the single-item first-price sealed-bid auction (or first price auction for short). One of the obstacles to the theoretical analysis of Nash convergence in the first price auction is the lack of an explicit characterization of its Nash equilibrium.
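To make the setting concrete, the sketch below simulates the kind of dynamics the abstract describes: bidders in a repeated first-price auction each run a simple mean-based learning rule (here, follow-the-leader on cumulative counterfactual utility with a little exploration). This is an illustrative toy, not the paper's model or algorithm; the bid grid, values, round count, and exploration rate are all assumptions chosen for the example.

```python
import random

# Toy simulation (assumptions, not the paper's setup): two bidders with
# fixed private values repeatedly play a single-item first-price auction
# over a discretized bid grid, each using a mean-based rule.

BIDS = [i / 20 for i in range(21)]   # assumed bid grid on [0, 1]
VALUES = [1.0, 0.8]                  # assumed private values of the bidders
ROUNDS = 5000                        # assumed horizon
EPSILON = 0.05                       # assumed exploration probability

# cumulative counterfactual utility of every candidate bid, per bidder
cum_utility = [[0.0 for _ in BIDS] for _ in VALUES]

def utility(value, my_bid, other_bid):
    """First-price utility: the winner pays their own bid; ties split in expectation."""
    if my_bid > other_bid:
        return value - my_bid
    if my_bid == other_bid:
        return 0.5 * (value - my_bid)
    return 0.0

bids = [0.0, 0.0]
for t in range(ROUNDS):
    # each bidder mostly plays the bid with the highest historical utility
    for i, value in enumerate(VALUES):
        if random.random() < EPSILON:
            bids[i] = random.choice(BIDS)                        # explore
        else:
            best = max(range(len(BIDS)), key=lambda k: cum_utility[i][k])
            bids[i] = BIDS[best]                                 # follow the leader
    # mean-based update: credit every candidate bid with the utility it
    # would have earned against the opponent's realized bid this round
    for i, value in enumerate(VALUES):
        other = bids[1 - i]
        for k, b in enumerate(BIDS):
            cum_utility[i][k] += utility(value, b, other)

print("final round bids:", bids)
```

Running such a simulation and checking whether the bid profile settles near an equilibrium of the one-shot auction is the empirical counterpart of the convergence question posed above; whether and when convergence actually holds is exactly what the paper studies analytically.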