
Game Theory: Instructional Materials



No-Regret Learning in Dynamic Competition with Reference Effects Under Logit Demand

Neural Information Processing Systems

We consider the dynamic price competition between two firms operating within an opaque marketplace, where each firm lacks information about its competitor. The demand follows the multinomial logit (MNL) choice model, which depends on the consumers' …
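
To make the setup concrete, here is a minimal sketch of MNL demand with a reference effect; the utility specification, coefficients, and the exponential-smoothing reference-price update are illustrative assumptions, not the paper's exact model:

```python
import numpy as np

def mnl_shares(prices, ref_price, a, beta, gamma):
    """Illustrative MNL purchase probabilities with a reference effect.

    Utility of firm i: u_i = a_i - beta * p_i + gamma * (r - p_i),
    where r is the consumers' reference price and the outside option
    has utility 0. (All coefficients here are illustrative assumptions.)
    """
    u = a - beta * prices + gamma * (ref_price - prices)
    expu = np.exp(u)
    return expu / (1.0 + expu.sum())  # outside option contributes exp(0) = 1

# Example: two firms, with a memory-based reference price updated by
# exponential smoothing (an assumed updating mechanism).
prices = np.array([4.0, 5.0])
r = 4.5                                   # current reference price
shares = mnl_shares(prices, r, a=np.array([1.0, 1.2]), beta=0.8, gamma=0.5)
r_next = 0.7 * r + 0.3 * prices.mean()    # assumed reference-price update
```

Under a specification like this, a firm's demand rises when its price falls below the reference price, and the reference update couples pricing decisions across periods, which is what makes the competition dynamic.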


The route to chaos in routing games: When is Price of Anarchy too optimistic?

Neural Information Processing Systems

Routing games are amongst the most studied classes of games in game theory. Their most well-known property is that learning dynamics typically converge to equilibria implying approximately optimal performance (low Price of Anarchy). We perform a stress test for these classic results by studying the ubiquitous learning dynamics, Multiplicative Weights Update (MWU), in different classes of congestion games, uncovering intricate non-equilibrium phenomena. We study MWU using the actual game costs without applying cost normalization to [0, 1]. Although this non-standard assumption leads to large regret, it captures the behavior of realistic agents.
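
For reference, here is a minimal sketch of the MWU dynamics as described, applied to a toy two-link routing game; the linear latency function, population size, and step size are illustrative assumptions:

```python
import numpy as np

def mwu_step(weights, costs, eps):
    """One Multiplicative Weights Update step: w_i <- w_i * exp(-eps * c_i).

    Costs are used as-is, with no normalization to [0, 1], mirroring the
    non-standard assumption studied in the paper.
    """
    new_w = weights * np.exp(-eps * costs)
    return new_w / new_w.sum()

# Toy two-link routing game (illustrative): N identical agents, and each
# link's latency equals its load.
N, eps, T = 100, 0.1, 50
x = np.array([0.9, 0.1])   # population mixed strategy over the two links
for _ in range(T):
    loads = N * x          # expected load on each link
    costs = loads          # linear latency: cost = load (assumption)
    x = mwu_step(x, costs, eps)
# With unnormalized costs, eps * costs can be large, so the dynamics may
# oscillate or behave chaotically instead of converging to equilibrium --
# the non-equilibrium phenomena the paper studies.
```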



Refining Minimax Regret for Unsupervised Environment Design

arXiv.org Artificial Intelligence

In unsupervised environment design, reinforcement learning agents are trained on environment configurations (levels) generated by an adversary that maximises some objective. Regret is a commonly used objective that theoretically results in a minimax regret (MMR) policy with desirable robustness guarantees; in particular, the agent's maximum regret is bounded. However, once the agent reaches this regret bound on all levels, the adversary will only sample levels where regret cannot be further reduced. Although there are possible performance improvements to be made outside of these regret-maximising levels, learning stagnates. In this work, we introduce Bayesian level-perfect MMR (BLP), a refinement of the minimax regret objective that overcomes this limitation. We formally show that solving for this objective results in a subset of MMR policies, and that BLP policies act consistently with a Perfect Bayesian policy over all levels. We further introduce an algorithm, ReMiDi, that results in a BLP policy at convergence. We empirically demonstrate that training on levels from a minimax regret adversary causes learning to prematurely stagnate, but that ReMiDi continues learning.
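
As a rough illustration of the stagnation argument (not of the ReMiDi algorithm itself), the sketch below shows a generic regret-maximising adversary; the level set and return values are made up for illustration:

```python
import numpy as np

def regret(level, agent_return, optimal_return):
    """Regret on a level: optimal achievable return minus the agent's return."""
    return optimal_return[level] - agent_return[level]

# Illustrative: a regret-maximising adversary only samples argmax-regret levels.
levels = np.arange(5)
optimal_return = {l: 1.0 for l in levels}   # assumed known per-level optima
agent_return = dict(zip(levels, [0.2, 0.9, 0.5, 0.9, 0.7]))
regrets = np.array([regret(l, agent_return, optimal_return) for l in levels])
chosen = levels[regrets == regrets.max()]   # once regret is tied everywhere,
                                            # the other levels are never sampled
                                            # and learning on them stagnates
```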


A tutorial on learning from preferences and choices with Gaussian Processes

arXiv.org Machine Learning

Preference modelling lies at the intersection of economics, decision theory, machine learning and statistics. By understanding individuals' preferences and how they make choices, we can build products that closely match their expectations, paving the way for more efficient and personalised applications across a wide range of domains. The objective of this tutorial is to present a cohesive and comprehensive framework for preference learning with Gaussian Processes (GPs), demonstrating how to seamlessly incorporate rationality principles (from economics and decision theory) into the learning process. By suitably tailoring the likelihood function, this framework enables the construction of preference learning models that encompass random utility models, limits of discernment, and scenarios with multiple conflicting utilities for both object- and label-preference. This tutorial builds upon established research while simultaneously introducing some novel GP-based models to address specific gaps in the existing literature.
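
A minimal sketch of the core ingredient is given below: a probit (random-utility) preference likelihood paired with a GP prior over latent utilities. The squared-exponential kernel and noise scale are illustrative choices, not the tutorial's specific models:

```python
import numpy as np
from scipy.stats import norm

def rbf(X1, X2, lengthscale=1.0):
    """Squared-exponential kernel (illustrative choice)."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def pref_likelihood(f_a, f_b, noise=1.0):
    """P(a preferred to b | f) under a probit random-utility model:
    Phi((f(a) - f(b)) / (sqrt(2) * noise))."""
    return norm.cdf((f_a - f_b) / (np.sqrt(2) * noise))

# Example: latent utilities drawn from a GP prior, then a pairwise preference.
rng = np.random.default_rng(0)
X = np.array([[0.0], [1.0], [2.0]])
K = rbf(X, X) + 1e-8 * np.eye(3)            # jitter for numerical stability
f = np.linalg.cholesky(K) @ rng.standard_normal(3)
p_0_over_1 = pref_likelihood(f[0], f[1])    # P(item 0 preferred to item 1)
```

Tailoring this likelihood (e.g., adding a limit of discernment, or combining several utilities) is exactly the lever the tutorial uses to recover the different preference-learning models it surveys.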


Statistical Games

arXiv.org Machine Learning

This work contains the mathematical exploration of a few prototypical games in which central concepts from statistics and probability theory naturally emerge. The first two kinds of games are termed Fisher and Bayesian games, which are connected to Frequentist and Bayesian statistics, respectively. Later, a more general type of game is introduced, termed Statistical game, in which a further parameter, the players' relative risk aversion, can be set. In this work, we show that Fisher and Bayesian games can be viewed as limiting cases of Statistical games. Therefore, Statistical games can be viewed as a unified framework incorporating both Frequentist and Bayesian statistics. Furthermore, a philosophical framework is (re-)presented -- often referred to as the minimax regret criterion -- as a general approach to decision making. The main motivation for this work was to embed Bayesian statistics into a broader decision-making framework, where, based on collected data, actions with consequences have to be taken, and these consequences can be translated into utilities (or rewards/losses) for the decision-maker. The work starts with the simplest possible toy model, related to hypothesis testing and statistical inference. This choice has two main benefits: (i) it allows us to determine (conjecture) the behaviour of the equilibrium strategies in various limiting cases, and (ii) it lets us introduce Statistical games without requiring additional stochastic parameters. The work contains game-theoretical methods related to two-player, non-cooperative games to determine and prove equilibrium strategies of Fisher, Bayesian and Statistical games. It also relies on analytical tools for derivations concerning various limiting cases.
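
The minimax regret criterion mentioned above is easy to state with a small worked example; the loss table below is made up for illustration:

```python
import numpy as np

# Illustrative loss table: rows = actions, columns = states of nature.
loss = np.array([[0.0, 10.0],
                 [4.0,  4.0],
                 [9.0,  1.0]])

# Regret of an action in a state: its loss minus the best achievable loss
# in that state.
regret = loss - loss.min(axis=0)            # [[0, 9], [4, 3], [9, 0]]

# Minimax regret criterion: pick the action whose worst-case regret is
# smallest -- here action 1, whose maximum regret is 4 (vs. 9 for the others).
best_action = regret.max(axis=1).argmin()
```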


Randomized Algorithms for Scientific Computing (RASC)

arXiv.org Artificial Intelligence

Randomized algorithms have propelled advances in artificial intelligence and represent a foundational research area in advancing AI for Science. Future advancements in DOE Office of Science priority areas such as climate science, astrophysics, fusion, advanced materials, combustion, and quantum computing all require randomized algorithms for surmounting challenges of complexity, robustness, and scalability. This report summarizes the outcomes of the workshop "Randomized Algorithms for Scientific Computing (RASC)," held virtually across four days in December 2020 and January 2021.
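
As one canonical example of the kind of method the report covers, here is a sketch of a randomized SVD in the style of Halko, Martinsson, and Tropp; the oversampling parameter is an illustrative choice:

```python
import numpy as np

def randomized_svd(A, k, oversample=10, seed=0):
    """Randomized SVD sketch: project onto a random subspace, orthonormalize
    to capture the approximate range of A, then solve a small exact SVD."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Omega = rng.standard_normal((n, k + oversample))   # random test matrix
    Q, _ = np.linalg.qr(A @ Omega)                     # approximate range of A
    U_small, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ U_small)[:, :k], s[:k], Vt[:k]

# Usage: a rank-20 approximation of a 1000 x 300 matrix.
U, s, Vt = randomized_svd(np.random.default_rng(1).random((1000, 300)), k=20)
```

The payoff is scalability: the expensive factorization is performed on a small sketched matrix rather than on A itself, which is the pattern behind many of the methods surveyed in the report.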