Game Theory: Overviews


A Survey on Data Pricing: from Economics to Data Science

arXiv.org Artificial Intelligence

How can we assess the value of data objectively, systematically and quantitatively? Pricing data, or information goods in general, has been studied and practiced in dispersed areas and under different principles, such as economics, marketing, electronic commerce, data management, data mining and machine learning. In this article, we present a unified, interdisciplinary and comprehensive overview of this important direction. We examine various motivations behind data pricing, analyze the economics of data pricing and review the development and evolution of pricing models according to a series of fundamental principles. We discuss both digital products and data products. We also consider a series of challenges and directions for future work.


Opponent Learning Awareness and Modelling in Multi-Objective Normal Form Games

arXiv.org Artificial Intelligence

Many real-world multi-agent interactions involve multiple distinct criteria, i.e. the payoffs are multi-objective in nature. However, the same multi-objective payoff vector may lead to different utilities for each participant. Therefore, it is essential for an agent to learn about the behaviour of other agents in the system. In this work, we present the first study of the effects of such opponent modelling on multi-objective multi-agent interactions with non-linear utilities. Specifically, we consider two-player multi-objective normal form games with non-linear utility functions under the scalarised expected returns optimisation criterion. We contribute novel actor-critic and policy gradient formulations to allow reinforcement learning of mixed strategies in this setting, along with extensions that incorporate opponent policy reconstruction and learning with opponent learning awareness (i.e., learning while considering the impact of one's policy when anticipating the opponent's learning step). Empirical results in five different MONFGs demonstrate that opponent learning awareness and modelling can drastically alter the learning dynamics in this setting. When equilibria are present, opponent modelling can confer significant benefits on agents that implement it. When there are no Nash equilibria, opponent learning awareness and modelling allow agents to converge to meaningful solutions that approximate equilibria.
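
To make the optimisation criterion concrete, the sketch below contrasts scalarised expected returns (SER, the criterion named above) with expected scalarised returns (ESR) on a small two-objective normal form game; the payoff numbers and the product utility function are illustrative assumptions, not taken from the paper.

    import numpy as np

    # Hypothetical 2x2 multi-objective normal form game: payoffs[a1, a2] is a
    # 2-dimensional payoff vector for the row player (illustrative numbers only).
    payoffs = np.array([[[4.0, 0.0], [3.0, 1.0]],
                        [[1.0, 3.0], [0.0, 4.0]]])

    def utility(v):
        # Example non-linear utility over a payoff vector: product of objectives.
        return v[0] * v[1]

    def ser(pi1, pi2):
        # Scalarised Expected Returns: utility of the *expected* payoff vector.
        expected_vec = np.einsum('i,j,ijk->k', pi1, pi2, payoffs)
        return utility(expected_vec)

    def esr(pi1, pi2):
        # Expected Scalarised Returns: expectation of the utility of each outcome.
        utils = np.array([[utility(payoffs[i, j]) for j in range(2)] for i in range(2)])
        return pi1 @ utils @ pi2

    pi1 = np.array([0.5, 0.5])   # row player's mixed strategy
    pi2 = np.array([0.5, 0.5])   # column player's mixed strategy
    print(ser(pi1, pi2), esr(pi1, pi2))  # the two values differ when the utility is non-linear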


Modeling and Prediction of Human Driver Behavior: A Survey

arXiv.org Artificial Intelligence

We present a review and taxonomy of 200 models from the literature on driver behavior modeling. We begin by introducing a mathematical formulation based on the partially observable stochastic game, which serves as a common framework for comparing and contrasting different driver models. Our taxonomy is constructed around the core modeling tasks of state estimation, intention estimation, trait estimation, and motion prediction; we also discuss the auxiliary tasks of risk estimation, anomaly detection, behavior imitation and microscopic traffic simulation. Existing driver models are categorized based on the specific tasks they address and key attributes of their approach.
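
For reference, a partially observable stochastic game is commonly written as the tuple below; this is generic textbook notation, and the survey's own symbols may differ.

    \langle \mathcal{I}, \mathcal{S}, \{\mathcal{A}_i\}_{i \in \mathcal{I}},
            \{\mathcal{O}_i\}_{i \in \mathcal{I}}, T, \{Z_i\}_{i \in \mathcal{I}},
            \{R_i\}_{i \in \mathcal{I}}, \gamma \rangle

Here \mathcal{I} is the set of agents (drivers), \mathcal{S} the state space, \mathcal{A}_i and \mathcal{O}_i each agent's action and observation spaces, T(s' | s, a_{1:n}) the transition kernel, Z_i(o_i | s', a_{1:n}) the per-agent observation models, R_i(s, a_{1:n}) the per-agent rewards, and \gamma a discount factor.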


Evolutionary Processes in Quantum Decision Theory

arXiv.org Artificial Intelligence

In recent years, there has been considerable interest in the possibility of formulating decision theory in the language of quantum mechanics. Numerous references on this topic can be found in the books [1-4] and review articles [5-8]. This interest is driven by the inability of classical decision theory [9] to describe the behaviour of real decision makers, which calls for the development of other approaches. Resorting to the techniques of quantum theory gives hope for a better representation of behavioral decision making. There are several ways of using quantum mechanics to interpret conscious effects.


AI and Game Theory - A Primer

#artificialintelligence

Game Theory, quite unlike its name, is a serious affair when it comes to the configuration and planning of an AI model. In essence, while linear machine learning deals largely with single-dimensional elements by its very nature, the true power of AI is actually unleashed through the application of game theory and its various facets. To understand the power of game theory in AI, however, it is essential to understand the basics of what actually constitutes game theory and its applications. So here's the promised primer on what game theory actually comprises. In its textbook definition, "Game Theory is the study of strategic interaction".
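
To ground that textbook definition, here is a minimal sketch of a two-player normal form game (a Prisoner's Dilemma with illustrative payoffs) and a brute-force check for pure-strategy Nash equilibria; the numbers are assumptions chosen only for illustration.

    import numpy as np

    # Hypothetical Prisoner's Dilemma: rows/columns are (Cooperate, Defect),
    # row[i, j] is the row player's payoff; the game is symmetric.
    row = np.array([[-1, -3],
                    [ 0, -2]])
    col = row.T

    def pure_nash(row, col):
        # An outcome is a pure Nash equilibrium if neither player can gain
        # by unilaterally switching to another action.
        eq = []
        for i in range(row.shape[0]):
            for j in range(row.shape[1]):
                if row[i, j] >= row[:, j].max() and col[i, j] >= col[i, :].max():
                    eq.append((i, j))
        return eq

    print(pure_nash(row, col))  # [(1, 1)]: mutual defection is the only pure equilibrium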


Multiagent Evaluation under Incomplete Information

arXiv.org Artificial Intelligence

This paper investigates the evaluation of learned multiagent strategies in the incomplete information setting, which plays a critical role in ranking and training of agents. Traditionally, researchers have relied on Elo ratings for this purpose, with recent works also using methods based on Nash equilibria. Unfortunately, Elo is unable to handle intransitive agent interactions, and other techniques are restricted to zero-sum, two-player settings or are limited by the fact that the Nash equilibrium is intractable to compute. Recently, a ranking method called α-Rank, relying on a new graph-based game-theoretic solution concept, was shown to tractably apply to general games. However, evaluations based on Elo or α-Rank typically assume noise-free game outcomes, despite the data often being collected from noisy simulations, making this assumption unrealistic in practice. This paper investigates multiagent evaluation in the incomplete information regime, involving general-sum many-player games with noisy outcomes. We derive sample complexity guarantees required to confidently rank agents in this setting. We propose adaptive algorithms for accurate ranking, provide correctness and sample complexity guarantees, then introduce a means of connecting uncertainties in noisy match outcomes to uncertainties in rankings. We evaluate the performance of these approaches in several domains, including Bernoulli games, a soccer meta-game, and Kuhn poker.
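
For context, the standard Elo update referred to above looks as follows; this is the textbook rule, not the paper's adaptive ranking algorithms, and it illustrates how a single noisy match outcome directly shifts the ratings.

    def elo_expected_score(r_a, r_b):
        # Standard logistic Elo win probability of player A against player B.
        return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

    def elo_update(r_a, r_b, score_a, k=32.0):
        # score_a is 1 for a win, 0.5 for a draw, 0 for a loss (possibly a noisy observation).
        e_a = elo_expected_score(r_a, r_b)
        return r_a + k * (score_a - e_a), r_b + k * ((1.0 - score_a) - (1.0 - e_a))

    # A single observed result moves both ratings; with noisy simulated matches,
    # many repeated games are needed before the ranking can be trusted.
    print(elo_update(1500, 1500, 1.0))  # (1516.0, 1484.0)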


OpenSpiel: A Framework for Reinforcement Learning in Games

arXiv.org Artificial Intelligence

OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games. OpenSpiel supports n-player (single- and multi-agent) zero-sum, cooperative and general-sum, one-shot and sequential, strictly turn-taking and simultaneous-move, perfect and imperfect information games, as well as traditional multiagent environments such as (partially and fully observable) grid worlds and social dilemmas. OpenSpiel also includes tools to analyze learning dynamics and other common evaluation metrics. This document serves both as an overview of the code base and an introduction to the terminology, core concepts, and algorithms across the fields of reinforcement learning, computational game theory, and search.
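
As a quick illustration of the Python API (pyspiel), the sketch below plays one random rollout of Kuhn poker; the package and game names follow OpenSpiel's public documentation, but exact installation and version details may vary.

    # pip install open_spiel  (package name per the project's documentation)
    import random
    import pyspiel

    game = pyspiel.load_game("kuhn_poker")
    state = game.new_initial_state()

    while not state.is_terminal():
        if state.is_chance_node():
            # Sample a chance outcome according to its probability.
            outcomes, probs = zip(*state.chance_outcomes())
            action = random.choices(outcomes, weights=probs)[0]
        else:
            # Pick a uniformly random legal action for the player to move.
            action = random.choice(state.legal_actions())
        state.apply_action(action)

    print(state.returns())  # terminal payoff for each player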


Extra-gradient with player sampling for provable fast convergence in n-player games

arXiv.org Machine Learning

Data-driven model training increasingly relies on finding Nash equilibria with provable techniques, e.g., for GANs and multi-agent RL. In this paper, we analyse a new extra-gradient method that performs gradient extrapolations and updates on a random subset of players at each iteration. This approach provably exhibits the same rate of convergence as full extra-gradient in non-smooth convex games. We propose an additional variance reduction mechanism so that the same guarantee holds for smooth convex games. Our approach makes extrapolation amenable to massive multiplayer settings, and brings empirical speed-ups, in particular when using cyclic sampling schemes. We demonstrate the efficiency of player sampling on large-scale non-smooth and non-strictly convex games. We show that the joint use of extrapolation and player sampling allows training better GANs on CIFAR10.
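
The sketch below illustrates the basic idea of extra-gradient with player sampling on a toy two-player game; the quadratic regularisation, step size and random coupling matrix are illustrative assumptions, and the paper's variance reduction mechanism is omitted.

    import numpy as np

    # Toy game with a unique Nash equilibrium at (0, 0):
    #   player 0 minimises 0.5*||x||^2 + x.T @ A @ y
    #   player 1 minimises 0.5*||y||^2 - x.T @ A @ y
    rng = np.random.default_rng(0)
    A = rng.standard_normal((3, 3))
    params = [rng.standard_normal(3), rng.standard_normal(3)]   # [x, y]
    eta = 0.1

    def grad(i, p):
        x, y = p
        return x + A @ y if i == 0 else y - A.T @ x

    for t in range(2000):
        sampled = rng.choice(2, size=1, replace=False)   # random subset of players
        # Extrapolation step on the sampled players only.
        look_ahead = [v.copy() for v in params]
        for i in sampled:
            look_ahead[i] = params[i] - eta * grad(i, params)
        # Update step, using gradients evaluated at the extrapolated point.
        for i in sampled:
            params[i] = params[i] - eta * grad(i, look_ahead)

    # Both norms shrink toward the equilibrium at zero in this strongly convex toy.
    print(np.linalg.norm(params[0]), np.linalg.norm(params[1]))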


Augmented Utilitarianism for AGI Safety

arXiv.org Artificial Intelligence

In light of the ongoing progress of research on artificially intelligent systems exhibiting a steadily increasing problem-solving ability, the identification of practicable solutions to the value alignment problem in AGI Safety is becoming a matter of urgency. In this context, one preeminent challenge that has been addressed by multiple researchers is the adequate formulation of utility functions, or equivalents, that reliably capture human ethical conceptions. However, the specification of suitable utility functions harbors the risk of "perverse instantiation", for which no final consensus on responsible proactive countermeasures has been achieved so far. Against this background, we propose a novel socio-technological ethical framework denoted Augmented Utilitarianism, which directly alleviates the perverse instantiation problem. We elaborate on how, augmented by AI and more generally by science and technology, it might allow a society to craft and update ethical utility functions while jointly undergoing a dynamical ethical enhancement. Further, we elucidate the need to consider embodied simulations in the design of utility functions for AGIs aligned with human values. Finally, we discuss future prospects regarding the usage of the presented scientifically grounded ethical framework and mention possible challenges.


A Complexity Approach for Core-Selecting Exchange under Conditionally Lexicographic Preferences

Journal of Artificial Intelligence Research

Core-selection is a crucial property of rules in the literature on resource allocation. It is also desirable, from the perspective of mechanism design, to address the incentive of agents to cheat by misreporting their preferences. This paper investigates the exchange problem where (i) each agent is initially endowed with (possibly multiple) indivisible goods, (ii) agents' preferences are assumed to be conditionally lexicographic, and (iii) side payments are prohibited. We propose an exchange rule called augmented top-trading-cycles (ATTC), based on the original TTC procedure. We first show that ATTC is core-selecting and runs in polynomial time with respect to the number of goods. We then show that finding a beneficial misreport under ATTC is NP-hard. We finally clarify the relationship of misreporting to splitting and hiding, two different types of manipulation, under ATTC.
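
For readers unfamiliar with the underlying procedure, here is a minimal sketch of the classic single-endowment top-trading-cycles algorithm that ATTC builds on; the preferences and endowments are hypothetical, and the paper's augmented variant for multiple goods and conditionally lexicographic preferences is not reproduced here.

    def top_trading_cycles(endowment, preferences):
        """endowment: agent -> good owned; preferences: agent -> list of goods, best first."""
        owner = {good: agent for agent, good in endowment.items()}
        remaining = set(endowment)
        allocation = {}
        while remaining:
            # Each remaining agent points at the owner of its most-preferred remaining good.
            points_to = {}
            for a in remaining:
                best = next(g for g in preferences[a] if owner.get(g) in remaining)
                points_to[a] = owner[best]
            # Follow pointers from any agent until a cycle repeats, then trade along it.
            path, seen = [], {}
            a = next(iter(remaining))
            while a not in seen:
                seen[a] = len(path)
                path.append(a)
                a = points_to[a]
            cycle = path[seen[a]:]
            for b in cycle:
                best = next(g for g in preferences[b] if owner.get(g) in remaining)
                allocation[b] = best
            remaining -= set(cycle)
        return allocation

    prefs = {1: ['g2', 'g1', 'g3'], 2: ['g1', 'g2', 'g3'], 3: ['g1', 'g2', 'g3']}
    endow = {1: 'g1', 2: 'g2', 3: 'g3'}
    print(top_trading_cycles(endow, prefs))  # {1: 'g2', 2: 'g1', 3: 'g3'}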