Filos-Ratsikas, Aris
Pushing the Frontier on Approximate EFX Allocations
Amanatidis, Georgios, Filos-Ratsikas, Aris, Sgouritsa, Alkmini
We study the problem of allocating a set of indivisible goods to a set of agents with additive valuation functions, aiming to achieve approximate envy-freeness up to any good ($\alpha$-EFX). The state-of-the-art results on the problem include that (exact) EFX allocations exist when (a) there are at most three agents, or (b) the agents' valuation functions can take at most two values, or (c) the agents' valuation functions can be represented via a graph. For $\alpha$-EFX, it is known that a $0.618$-EFX allocation exists for any number of agents with additive valuation functions. In this paper, we show that $2/3$-EFX allocations exist when (a) there are at most \emph{seven agents}, (b) the agents' valuation functions can take at most \emph{three values}, or (c) the agents' valuation functions can be represented via a \emph{multigraph}. Our results can be interpreted in two ways. First, by relaxing the notion of EFX to $2/3$-EFX, we obtain existence results for strict generalizations of the settings for which exact EFX allocations are known to exist. Secondly, by imposing restrictions on the setting, we manage to beat the barrier of $0.618$ and achieve an approximation guarantee of $2/3$. Therefore, our results push the \emph{frontier} of existence and computation of approximate EFX allocations, and provide insights into the challenges of settling the existence of exact EFX allocations.
A Few Queries Go a Long Way: Information-Distortion Tradeoffs in Matching
Amanatidis, Georgios (University of Essex) | Birmpas, Georgios (Sapienza University of Rome) | Filos-Ratsikas, Aris (University of Liverpool) | Voudouris, Alexandros A. (University of Essex)
We consider the One-Sided Matching problem, where n agents have preferences over n items, and these preferences are induced by underlying cardinal valuation functions. The goal is to match every agent to a single item so as to maximize the social welfare. Most of the related literature, however, assumes that the values of the agents are not a priori known, and only access to the ordinal preferences of the agents over the items is provided. Consequently, this incomplete information leads to loss of efficiency, which is measured by the notion of distortion. In this paper, we further assume that the agents can answer a small number of queries, allowing us partial access to their values. We study the interplay between elicited cardinal information (measured by the number of queries per agent) and distortion for One-Sided Matching, as well as a wide range of well-studied related problems. Qualitatively, our results show that with a limited number of queries, it is possible to obtain significant improvements over the classic setting, where only access to ordinal information is given.
Achieving Diverse Objectives with AI-driven Prices in Deep Reinforcement Learning Multi-agent Markets
Danassis, Panayiotis, Filos-Ratsikas, Aris, Faltings, Boi
We propose a practical approach to computing market prices and allocations via a deep reinforcement learning policymaker agent, operating in an environment of other learning agents. Compared to the idealized market equilibrium outcome -- which we use as a benchmark -- our policymaker is much more flexible, allowing us to tune the prices with regard to diverse objectives such as sustainability and resource wastefulness, fairness, buyers' and sellers' welfare, etc. To evaluate our approach, we design a realistic market with multiple and diverse buyers and sellers. Additionally, the sellers, which are deep learning agents themselves, compete for resources in a common-pool appropriation environment based on bio-economic models of commercial fisheries. We demonstrate that: (a) The introduced policymaker is able to achieve comparable performance to the market equilibrium, showcasing the potential of such approaches in markets where the equilibrium prices can not be efficiently computed. (b) Our policymaker can notably outperform the equilibrium solution on certain metrics, while at the same time maintaining comparable performance for the remaining ones. (c) As a highlight of our findings, our policymaker is significantly more successful in maintaining resource sustainability, compared to the market outcome, in scarce resource environments.
Rewarding High-Quality Data via Influence Functions
Richardson, Adam, Filos-Ratsikas, Aris, Faltings, Boi
We consider a crowdsourcing data acquisition scenario, such as federated learning, where a Center collects data points from a set of rational Agents, with the aim of training a model. For linear regression models, we show how a payment structure can be designed to incentivize the agents to provide high-quality data as early as possible, based on a characterization of the influence that data points have on the loss function of the model. Our contributions can be summarized as follows: (a) we prove theoretically that this scheme ensures truthful data reporting as a game-theoretic equilibrium and further demonstrate its robustness against mixtures of truthful and heuristic data reports, (b) we design a procedure according to which the influence computation can be efficiently approximated and processed sequentially in batches over time, (c) we develop a theory that allows correcting the difference between the influence and the overall change in loss and (d) we evaluate our approach on real datasets, confirming our theoretical findings.
Anytime Heuristic for Weighted Matching Through Altruism-Inspired Behavior
Danassis, Panayiotis, Filos-Ratsikas, Aris, Faltings, Boi
We present a novel anytime heuristic (ALMA), inspired by the human principle of altruism, for solving the assignment problem. ALMA is decentralized, completely uncoupled, and requires no communication between the participants. We prove an upper bound on the convergence speed that is polynomial in the desired number of resources and competing agents per resource; crucially, in the realistic case where the aforementioned quantities are bounded independently of the total number of agents/resources, the convergence time remains constant as the total problem size increases. We have evaluated ALMA under three test cases: (i) an anti-coordination scenario where agents with similar preferences compete over the same set of actions, (ii) a resource allocation scenario in an urban environment, under a constant-time constraint, and finally, (iii) an on-line matching scenario using real passenger-taxi data. In all of the cases, ALMA was able to reach high social welfare, while being orders of magnitude faster than the centralized, optimal algorithm. The latter allows our algorithm to scale to realistic scenarios with hundreds of thousands of agents, e.g., vehicle coordination in urban environments.
The Complexity of Splitting Necklaces and Bisecting Ham Sandwiches
Filos-Ratsikas, Aris, Goldberg, Paul W.
We resolve the computational complexity of two problems known as NECKLACE-SPLITTING and DISCRETE HAM SANDWICH, showing that they are PPA-complete. We do this via a PPA-completeness result for an approximate version of the CONSENSUS-HALVING problem, strengthening our recent result that the problem is PPA-complete for inverse-exponential precision. At the heart of our construction is a smooth embedding of the high-dimensional M\"{o}bius strip in the CONSENSUS-HALVING problem. These results settle the status of PPA as a class that captures the complexity of "natural'" problems whose definitions do not incorporate a circuit.
Reinforcement Mechanism Design for e-commerce
Cai, Qingpeng, Filos-Ratsikas, Aris, Tang, Pingzhong, Zhang, Yiwei
We study the problem of allocating impressions to sellers in e-commerce websites, such as Amazon, eBay or Taobao, aiming to maximize the total revenue generated by the platform. We employ a general framework of reinforcement mechanism design, which uses deep reinforcement learning to design efficient algorithms, taking the strategic behaviour of the sellers into account. Specifically, we model the impression allocation problem as a Markov decision process, where the states encode the history of impressions, prices, transactions and generated revenue and the actions are the possible impression allocations in each round. To tackle the problem of continuity and high-dimensionality of states and actions, we adopt the ideas of the DDPG algorithm to design an actor-critic policy gradient algorithm which takes advantage of the problem domain in order to achieve convergence and stability. We evaluate our proposed algorithm, coined IA(GRU), by comparing it against DDPG, as well as several natural heuristics, under different rationality models for the sellers - we assume that sellers follow well-known no-regret type strategies which may vary in their degree of sophistication. We find that IA(GRU) outperforms all algorithms in terms of the total revenue.
Reinforcement Mechanism Design for Fraudulent Behaviour in e-Commerce
Cai, Qingpeng (Tsinghua University) | Filos-Ratsikas, Aris (University of Oxford) | Tang, Pingzhong (Tsinghua University) | Zhang, Yiwei (UC Berkeley)
In large e-commerce websites, sellers have been observed to engage in fraudulent behaviour, faking historical transactions in order to receive favourable treatment from the platforms, specifically through the allocation of additional buyer impressions which results in higher revenue for them, but not for the system as a whole. This emergent phenomenon has attracted considerable attention, with previous approaches focusing on trying to detect illicit practices and to punish the miscreants. In this paper, we employ the principles of reinforcement mechanism design, a framework that combines the fundamental goals of classical mechanism design, i.e. the consideration of agents' incentives and their alignment with the objectives of the designer, with deep reinforcement learning for optimizing the performance based on these incentives. In particular, first we set up a deep-learning framework for predicting the sellers' rationality, based on real data from any allocation algorithm. We use data from one of largest e-commerce platforms worldwide and train a neural network model to predict the extent to which the sellers will engage in fraudulent behaviour. Using this rationality model, we employ an algorithm based on deep reinforcement learning to optimize the objectives and compare its performance against several natural heuristics, including the platform's implementation and incentive-based mechanisms from the related literature.
Facility Location with Double-Peaked Preferences
Filos-Ratsikas, Aris (Aarhus University) | Li, Minming (City University of Hong Kong) | Zhang, Jie (University of Oxford) | Zhang, Qiang ( University of Warsaw )
We study the problem of locating a single facility on a real line based on the reports of self-interested agents, when agents have double-peaked preferences, with the peaks being on opposite sides of their locations.We observe that double-peaked preferences capture real-life scenarios and thus complement the well-studied notion of single-peaked preferences. We mainly focus on the case where peaks are equidistant from the agents’ locations and discuss how our results extend to more general settings. We show that most of the results for single-peaked preferences do not directly apply to this setting; this makes the problem essentially more challenging. As our main contribution, we present a simple truthful-in-expectation mechanism that achieves an approximation ratio of 1+b/c for both the social and the maximum cost, where b is the distance of the agent from the peak and c is the minimum cost of an agent. For the latter case, we provide a 3/2 lower bound on the approximation ratio of any truthful-in-expectation mechanism. We also study deterministic mechanisms under some natural conditions, proving lower bounds and approximation guarantees. We prove that among a large class of reasonable mechanisms, there is no deterministic mechanism that outpeforms our truthful-in-expectation mechanism.
The Fisher Market Game: Equilibrium and Welfare
Brânzei, Simina (Aarhus University) | Chen, Yiling (Harvard University) | Deng, Xiaotie (Shanghai Jiao Tong University) | Filos-Ratsikas, Aris (Aarhus University) | Frederiksen, Søren Kristoffer Stiil (Aarhus University) | Zhang, Jie (University of Oxford)
The Fisher market model is one of the most fundamental resource allocation models in economics. In a Fisher market, the prices and allocations of goods are determined according to the preferences and budgets of buyers to clear the market. In a Fisher market game, however, buyers are strategic and report their preferences over goods; the market-clearing prices and allocations are then determined based on their reported preferences rather than their real preferences. We show that the Fisher market game always has a pure Nash equilibrium, for buyers with linear, Leontief, and Cobb-Douglas utility functions, which are three representative classes of utility functions in the important Constant Elasticity of Substitution (CES) family. Furthermore, to quantify the social efficiency, we prove Price of Anarchy bounds for the game when the utility functions of buyers fall into these three classes respectively.