Brânzei, Simina
Dueling Over Dessert, Mastering the Art of Repeated Cake Cutting
Brânzei, Simina, Hajiaghayi, MohammadTaghi, Phillips, Reed, Shin, Suho, Wang, Kun
We consider the setting of repeated fair division between two players, denoted Alice and Bob, with private valuations over a cake. In each round, a new cake arrives, which is identical to the ones in previous rounds. Alice cuts the cake at a point of her choice, while Bob chooses the left piece or the right piece, leaving the remainder for Alice. We consider two versions: sequential, where Bob observes Alice's cut point before choosing left/right, and simultaneous, where he only observes her cut point after making his choice. The simultaneous version was first considered by Aumann and Maschler (1995). We observe that if Bob is almost myopic and chooses his favorite piece too often, then he can be systematically exploited by Alice through a strategy akin to a binary search. This strategy allows Alice to approximate Bob's preferences with increasing precision, thereby securing a disproportionate share of the resource over time. We analyze the limits of how much a player can exploit the other one and show that fair utility profiles are in fact achievable. Specifically, the players can enforce the equitable utility profile of $(1/2, 1/2)$ in the limit on every trajectory of play, by keeping the other player's utility to approximately $1/2$ on average while guaranteeing they themselves get at least approximately $1/2$ on average. We show this theorem using a connection with Blackwell approachability. Finally, we analyze a natural dynamic known as fictitious play, where players best respond to the empirical distribution of the other player. We show that fictitious play converges to the equitable utility profile of $(1/2, 1/2)$ at a rate of $O(1/\sqrt{T})$.
Multiplayer Bandit Learning, from Competition to Cooperation
Brânzei, Simina, Peres, Yuval
The stochastic multi-armed bandit model captures the tradeoff between exploration and exploitation. We study the effects of competition and cooperation on this tradeoff. Suppose there are $k$ arms and two players, Alice and Bob. In every round, each player pulls an arm, receives the resulting reward, and observes the choice of the other player but not their reward. Alice's utility is $\Gamma_A + \lambda \Gamma_B$ (and similarly for Bob), where $\Gamma_A$ is Alice's total reward and $\lambda \in [-1, 1]$ is a cooperation parameter. At $\lambda = -1$ the players are competing in a zero-sum game, at $\lambda = 1$, they are fully cooperating, and at $\lambda = 0$, they are neutral: each player's utility is their own reward. The model is related to the economics literature on strategic experimentation, where usually players observe each other's rewards. With discount factor $\beta$, the Gittins index reduces the one-player problem to the comparison between a risky arm, with a prior $\mu$, and a predictable arm, with success probability $p$. The value of $p$ where the player is indifferent between the arms is the Gittins index $g = g(\mu,\beta) > m$, where $m$ is the mean of the risky arm. We show that competing players explore less than a single player: there is $p^* \in (m, g)$ so that for all $p > p^*$, the players stay at the predictable arm. However, the players are not myopic: they still explore for some $p > m$. On the other hand, cooperating players explore more than a single player. We also show that neutral players learn from each other, receiving strictly higher total rewards than they would playing alone, for all $ p\in (p^*, g)$, where $p^*$ is the threshold from the competing case. Finally, we show that competing and neutral players eventually settle on the same arm in every Nash equilibrium, while this can fail for cooperating players.
Online Learning in Multi-unit Auctions
Brânzei, Simina, Derakhshan, Mahsa, Golrezaei, Negin, Han, Yanjun
We consider repeated multi-unit auctions with uniform pricing, which are widely used in practice for allocating goods such as carbon licenses. In each round, $K$ identical units of a good are sold to a group of buyers that have valuations with diminishing marginal returns. The buyers submit bids for the units, and then a price $p$ is set per unit so that all the units are sold. We consider two variants of the auction, where the price is set to the $K$-th highest bid and $(K+1)$-st highest bid, respectively. We analyze the properties of this auction in both the offline and online settings. In the offline setting, we consider the problem that one player $i$ is facing: given access to a data set that contains the bids submitted by competitors in past auctions, find a bid vector that maximizes player $i$'s cumulative utility on the data set. We design a polynomial time algorithm for this problem, by showing it is equivalent to finding a maximum-weight path on a carefully constructed directed acyclic graph. In the online setting, the players run learning algorithms to update their bids as they participate in the auction over time. Based on our offline algorithm, we design efficient online learning algorithms for bidding. The algorithms have sublinear regret, under both full information and bandit feedback structures. We complement our online learning algorithms with regret lower bounds. Finally, we analyze the quality of the equilibria in the worst case through the lens of the core solution concept in the game among the bidders. We show that the $(K+1)$-st price format is susceptible to collusion among the bidders; meanwhile, the $K$-th price format does not have this issue.
Online Learning with an Almost Perfect Expert
Brânzei, Simina, Peres, Yuval
We study the online learning problem where a forecaster makes a sequence of binary predictions using the advice of $n$ experts. Our main contribution is to analyze the regime where the best expert makes at most $b$ mistakes and to show that when $b = o(\log_4{n})$, the expected number of mistakes made by the optimal forecaster is at most $\log_4{n} + o(\log_4{n})$. We also describe an adversary strategy showing that this bound is tight.
An Algorithmic Framework for Strategic Fair Division
Brânzei, Simina (University of California Berkeley) | Caragiannis, Ioannis (University of Patras) | Kurokawa, David (Carnegie Mellon University) | Procaccia, Ariel D. (Carnegie Mellon University)
A large body of literature deals with the so-called cake cutting So how would strategic agents behave when faced with problem -- a misleadingly childish metaphor for the the cut and choose protocol? A standard way of answering challenging and important task of fairly dividing a heterogeneous this question employs the notion of Nash equilibrium: each divisible good among multiple agents (see the recent agent would use a strategy that is a best response to the other survey by Procaccia (2013) and the books by Brams agent's strategy. To set up a Nash equilibrium, suppose that and Taylor (1996) and Robertson and Webb (1998)). In particular, the first agent cuts two pieces that the second agent values there is a significant amount of AI work on cake cutting equally; the second agent selects its more preferred piece, (Procaccia 2009; Caragiannis, Lai, and Procaccia 2011; and the one less preferred by the first agent in case of a tie. Brams et al. 2012; Bei et al. 2012; Aumann, Dombb, Clearly, the second agent cannot gain from deviating, as it is and Hassidim 2013; Kurokawa, Lai, and Procaccia 2013; selecting a piece that is at least as preferred as the other. As Brânzei, Procaccia, and Zhang 2013; Brânzei and Miltersen for the first agent, if it makes its preferred piece even bigger, 2013; Chen et al. 2013; Balkanski et al. 2014; Brânzei the second agent would choose that piece, making the and Miltersen 2015; Segal-Halevi, Hassidim, and Aumann first agent worse off. Interestingly enough, in this equilibrium 2015), which is closely intertwined with emerging realworld the tables are turned; now it is the second agent who applications of fair division more broadly (Goldman is getting exactly half of its value for the whole cake, while and Procaccia 2014; Kurokawa, Procaccia, and Shah 2015).
Simultaneous Cake Cutting
Balkanski, Eric (Carnegie Mellon University) | Brânzei, Simina (Aarhus University) | Kurokawa, David (Carnegie Mellon University) | Procaccia, Ariel D. (Carnegie Mellon University)
We introduce the simultaneous model for cake cutting (the fair allocation of a divisible good), in which agents simultaneously send messages containing a sketch of their preferences over the cake. We show that this model enables the computation of divisions that satisfy proportionality -- a popular fairness notion -- using a protocol that circumvents a standard lower bound via parallel information elicitation. Cake divisions satisfying another prominent fairness notion, envy-freeness, are impossible to compute in the simultaneous model, but admit arbitrarily good approximations.
The Fisher Market Game: Equilibrium and Welfare
Brânzei, Simina (Aarhus University) | Chen, Yiling (Harvard University) | Deng, Xiaotie (Shanghai Jiao Tong University) | Filos-Ratsikas, Aris (Aarhus University) | Frederiksen, Søren Kristoffer Stiil (Aarhus University) | Zhang, Jie (University of Oxford)
The Fisher market model is one of the most fundamental resource allocation models in economics. In a Fisher market, the prices and allocations of goods are determined according to the preferences and budgets of buyers to clear the market. In a Fisher market game, however, buyers are strategic and report their preferences over goods; the market-clearing prices and allocations are then determined based on their reported preferences rather than their real preferences. We show that the Fisher market game always has a pure Nash equilibrium, for buyers with linear, Leontief, and Cobb-Douglas utility functions, which are three representative classes of utility functions in the important Constant Elasticity of Substitution (CES) family. Furthermore, to quantify the social efficiency, we prove Price of Anarchy bounds for the game when the utility functions of buyers fall into these three classes respectively.
Weighted Clustering
Ackerman, Margareta (University of Waterloo) | Ben-David, Shai (University of Waterloo) | Brânzei, Simina (Aarhus University) | Loker, David (University of Waterloo)
We investigate a natural generalization of the classical clustering problem, considering clustering tasks in which different instances may have different weights. We conduct the first extensive theoretical analysis on the influence of weighted data on standard clustering algorithms in both the partitional and hierarchical settings, characterizing the conditions under which algorithms react to weights. Extending a recent framework for clustering algorithm selection, we propose intuitive properties that would allow users to choose between clustering algorithms in the weighted setting and classify algorithms accordingly.