Agents
On Nash Equilibria in Normal-Form Games With Vectorial Payoffs
Röpke, Willem, Roijers, Diederik M., Nowé, Ann, Rădulescu, Roxana
We provide an in-depth study of Nash equilibria in multi-objective normal form games (MONFGs), i.e., normal form games with vectorial payoffs. Taking a utility-based approach, we assume that each player's utility can be modelled with a utility function that maps a vector to a scalar utility. In the case of a mixed strategy, it is meaningful to apply such a scalarisation both before calculating the expectation of the payoff vector as well as after. This distinction leads to two optimisation criteria. With the first criterion, players aim to optimise the expected value of their utility function applied to the payoff vectors obtained in the game. With the second criterion, players aim to optimise the utility of expected payoff vectors given a joint strategy. Under this latter criterion, it was shown that Nash equilibria need not exist. Our first contribution is to provide a sufficient condition under which Nash equilibria are guaranteed to exist. Secondly, we show that when Nash equilibria do exist under both criteria, no equilibrium needs to be shared between the two criteria, and even the number of equilibria can differ. Thirdly, we contribute a study of pure strategy Nash equilibria under both criteria. We show that when assuming quasiconvex utility functions for players, the sets of pure strategy Nash equilibria under both optimisation criteria are equivalent. This result is further extended to games in which players adhere to different optimisation criteria. Finally, given these theoretical results, we construct an algorithm to compute all pure strategy Nash equilibria in MONFGs where players have a quasiconvex utility function.
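The gap between the two optimisation criteria can be seen with a small numerical sketch. It is an illustration only, not the authors' code: the function names, the product utility, and the two payoff vectors are all assumptions chosen to make the gap visible.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]


def expected_vector(probs: Sequence[float], payoffs: Sequence[Vector]) -> List[float]:
    """Probability-weighted average of the payoff vectors."""
    dim = len(payoffs[0])
    return [sum(p * v[k] for p, v in zip(probs, payoffs)) for k in range(dim)]


def utility_then_expectation(
    probs: Sequence[float],
    payoffs: Sequence[Vector],
    utility: Callable[[Vector], float],
) -> float:
    """First criterion: expectation of the utility applied to each payoff vector."""
    return sum(p * utility(v) for p, v in zip(probs, payoffs))


def expectation_then_utility(
    probs: Sequence[float],
    payoffs: Sequence[Vector],
    utility: Callable[[Vector], float],
) -> float:
    """Second criterion: utility applied to the expected payoff vector."""
    return utility(expected_vector(probs, payoffs))


def product_utility(v: Vector) -> float:
    """Hypothetical non-linear utility: the product of the two objectives."""
    return v[0] * v[1]


# Assumed example: a strategy that yields (4, 0) or (0, 4) with equal probability.
payoffs = [(4.0, 0.0), (0.0, 4.0)]
probs = [0.5, 0.5]

print(utility_then_expectation(probs, payoffs, product_utility))  # 0.0
print(expectation_then_utility(probs, payoffs, product_utility))  # 4.0
```

With a linear utility the two quantities coincide, so the distinction only matters when the utility function is non-linear, as in this example.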
A Survey of Decision Making in Adversarial Games
Li, Xiuxian, Meng, Min, Hong, Yiguang, Chen, Jie
Game theory has by now found numerous applications in various fields, including economics, industry, jurisprudence, and artificial intelligence, where each player only cares about its own interest in a noncooperative or cooperative manner, but without obvious malice to other players. However, in many practical applications, such as poker, chess, pursuit-evasion, drug interdiction, coast guard, cyber-security, and national defense, players often have apparently adversarial stances, that is, selfish actions of each player inevitably or intentionally inflict loss or wreak havoc on other players. Along this line, this paper provides a systematic survey on three main game models widely employed in adversarial games, i.e., zero-sum normal-form and extensive-form games, Stackelberg (security) games, and zero-sum differential games, from an array of perspectives, including basic knowledge of game models, (approximate) equilibrium concepts, problem classifications, research frontiers, (approximate) optimal strategy seeking techniques, prevailing algorithms, and practical applications. Finally, promising future research directions are also discussed for relevant adversarial games.
Relative Position Estimation in Multi-Agent Systems Using Attitude-Coupled Range Measurements
Shalaby, Mohammed, Cossette, Charles Champagne, Forbes, James Richard, Ny, Jerome Le
The ability to accurately estimate the position of robotic agents relative to one another, in possibly GPS-denied environments, is crucial to execute collaborative tasks. Inter-agent range measurements are available at a low cost, due to technologies such as ultra-wideband radio. However, the task of three-dimensional relative position estimation using range measurements in multi-agent systems suffers from unobservabilities. This letter presents a sufficient condition for the observability of the relative positions, and satisfies the condition using a simple framework with only range measurements, an accelerometer, a rate gyro, and a magnetometer. The framework has been tested in simulation and in experiments, where 40-50 cm positioning accuracy is achieved using inexpensive off-the-shelf hardware.
MetaDrive: Composing Diverse Driving Scenarios for Generalizable Reinforcement Learning
Li, Quanyi, Peng, Zhenghao, Feng, Lan, Zhang, Qihang, Xue, Zhenghai, Zhou, Bolei
Driving safely requires multiple capabilities from human and intelligent agents, such as generalizability to unseen environments, safety awareness of the surrounding traffic, and decision-making in complex multi-agent settings. Despite the great success of Reinforcement Learning (RL), most RL research investigates each capability separately due to the lack of integrated environments. In this work, we develop a new driving simulation platform called MetaDrive to support the research of generalizable reinforcement learning algorithms for machine autonomy. MetaDrive is highly compositional and can generate an infinite number of diverse driving scenarios through both procedural generation and real data importing. Based on MetaDrive, we construct a variety of RL tasks and baselines in both single-agent and multi-agent settings, including benchmarking generalizability across unseen scenes, safe exploration, and learning multi-agent traffic. The generalization experiments conducted on both procedurally generated scenarios and real-world scenarios show that increasing the diversity and the size of the training set improves the RL agent's generalizability. We further evaluate various safe reinforcement learning and multi-agent reinforcement learning algorithms in MetaDrive environments and provide the benchmarks. Source code, documentation, and demo video are available at https://metadriverse.github.io/metadrive.
Sampling Equilibria: Fast No-Regret Learning in Structured Games
Beaglehole, Daniel, Hopkins, Max, Kane, Daniel, Liu, Sihan, Lovett, Shachar
Learning and equilibrium computation in games are fundamental problems across computer science and economics, with applications ranging from politics to machine learning. Much of the work in this area revolves around a simple algorithm termed randomized weighted majority (RWM), also known as "Hedge" or "Multiplicative Weights Update," which is well known to achieve statistically optimal rates in adversarial settings (Littlestone and Warmuth '94, Freund and Schapire '99). Unfortunately, RWM comes with an inherent computational barrier: it requires maintaining and sampling from a distribution over all possible actions. In typical settings of interest the action space is exponentially large, seemingly rendering RWM useless in practice. In this work, we refute this notion for a broad variety of structured games, showing it is possible to efficiently (approximately) sample the action space in RWM in polylogarithmic time. This gives the first efficient no-regret algorithms for problems such as the (discrete) Colonel Blotto game, matroid congestion, matroid security, and basic dueling games. As an immediate corollary, we give a polylogarithmic time meta-algorithm to compute approximate Nash Equilibria for these games that is exponentially faster than prior methods in several important settings. Further, our algorithm is the first to efficiently compute equilibria for more involved variants of these games with general sums, more than two players, and, for Colonel Blotto, multiple resource types.
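For context, the computational barrier can be seen in a naive sketch of the exponential-weights form of RWM, which stores one weight per action explicitly. This is the textbook baseline under assumed names (`hedge_distribution`, `sample_action`) and an assumed learning rate `eta`. It is not the paper's sampling method, which avoids enumerating the action space.

```python
import math
import random
from typing import List, Sequence


def hedge_distribution(loss_rounds: Sequence[Sequence[float]], eta: float) -> List[float]:
    """Return the action distribution after exponential-weights updates.

    loss_rounds[t][a] is the loss of action a in round t. Memory and time per
    round grow linearly with the number of actions, which is the bottleneck
    when the action space is exponentially large.
    """
    n_actions = len(loss_rounds[0])
    dist = [1.0 / n_actions] * n_actions
    for losses in loss_rounds:
        # Down-weight each action exponentially in its observed loss.
        dist = [w * math.exp(-eta * loss) for w, loss in zip(dist, losses)]
        total = sum(dist)
        dist = [w / total for w in dist]  # renormalise to a probability vector
    return dist


def sample_action(dist: Sequence[float], rng: random.Random) -> int:
    """Draw one action index according to the current distribution."""
    return rng.choices(range(len(dist)), weights=dist, k=1)[0]


# Assumed toy instance: action 0 never incurs loss, action 1 always does.
rounds = [[0.0, 1.0]] * 10
dist = hedge_distribution(rounds, eta=0.5)
print(dist)
print(sample_action(dist, random.Random(0)))
```

On this toy instance the probability mass shifts toward the loss-free action. The explicit list `dist` is the object that becomes infeasible to store or sample from directly once the number of actions is exponential.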
Cooperative Marine Operations via Ad Hoc Teams
Carlucho, Ignacio, Rahman, Arrasy, Ard, William, Fosong, Elliot, Barbalata, Corina, Albrecht, Stefano V.
While research in ad hoc teamwork has great potential for solving real-world robotic applications, most developments so far have been focusing on environments with simple dynamics. In this article, we discuss how the problem of ad hoc teamwork can be of special interest for marine robotics and how it can aid marine operations. Particularly, we present a set of challenges that need to be addressed for achieving ad hoc teamwork in underwater environments and we discuss possible solutions based on current state-of-the-art developments in the ad hoc teamwork literature.
Data-driven Control of Agent-based Models: an Equation/Variable-free Machine Learning Approach
We present an Equation/Variable free machine learning (EVFML) framework for the control of the collective dynamics of complex/multiscale systems modelled via microscopic/agent-based simulators. The approach obviates the need for construction of surrogate, reduced-order models. The proposed implementation consists of three steps: (A) from high-dimensional agent-based simulations, machine learning (in particular, non-linear manifold learning via Diffusion Maps (DMs)) helps identify a set of coarse-grained variables that parametrize the low-dimensional manifold on which the emergent/collective dynamics evolve. The out-of-sample extension and pre-image problems, i.e. the construction of non-linear mappings from the high-dimensional input space to the low-dimensional manifold and back, are solved by coupling DMs with the Nyström extension and Geometric Harmonics, respectively; (B) having identified the manifold and its coordinates, we exploit the Equation-free approach to perform numerical bifurcation analysis of the emergent dynamics; then (C) based on the previous steps, we design data-driven embedded wash-out controllers that drive the agent-based simulators to their intrinsic, imprecisely known, emergent open-loop unstable steady-states, thus demonstrating that the scheme is robust against numerical approximation errors and modelling uncertainty. The efficiency of the framework is illustrated by controlling emergent unstable (i) traveling waves of a deterministic agent-based model of traffic dynamics, and (ii) equilibria of a stochastic financial market agent model with mimesis.
RILI: Robustly Influencing Latent Intent
Parekh, Sagar, Habibian, Soheil, Losey, Dylan P.
When robots interact with human partners, often these partners change their behavior in response to the robot. On the one hand this is challenging because the robot must learn to coordinate with a dynamic partner. But on the other hand -- if the robot understands these dynamics -- it can harness its own behavior, influence the human, and guide the team towards effective collaboration. Prior research enables robots to learn to influence other robots or simulated agents. In this paper we extend these learning approaches to now influence humans. What makes humans especially hard to influence is that -- not only do humans react to the robot -- but the way a single user reacts to the robot may change over time, and different humans will respond to the same robot behavior in different ways. We therefore propose a robust approach that learns to influence changing partner dynamics. Our method first trains with a set of partners across repeated interactions, and learns to predict the current partner's behavior based on the previous states, actions, and rewards. Next, we rapidly adapt to new partners by sampling trajectories the robot learned with the original partners, and then leveraging those existing behaviors to influence the new partner dynamics. We compare our resulting algorithm to state-of-the-art baselines across simulated environments and a user study where the robot and participants collaborate to build towers. We find that our approach outperforms the alternatives, even when the partner follows new or unexpected dynamics. Videos of the user study are available here: https://youtu.be/lYsWM8An18g
Robot Swarms as Hybrid Systems: Modelling and Verification
Schupp, Stefan, Leofante, Francesco, Behr, Leander, Ábrahám, Erika, Taccella, Armando
Swarm robotic systems are distributed systems wherein a set of robots cooperatively perform a task, without any centralized coordination [29]. Although individual robots are governed by relatively simple reactive controllers, interactions within the swarm may give rise to complex behaviors that were not explicitly programmed. Ultimately, these behaviors enable the swarm to achieve goals that would defy each single robot in isolation, or would require more expensive robots to achieve the same goals as effectively as the swarm does - see, e.g., [4, 33] for some examples. While understanding individual robot behavior is easy, predicting the overall swarm behavior is difficult, and thus engineering controllers for individual robots that will guarantee a desired swarm behavior is not a straightforward task. Traditionally, the analysis of swarms is carried out either by testing real robot implementations, or by computational simulations [20, 22]; however, these approaches provide few guarantees as they suffer from intrinsically incomplete coverage. As suggested by many authors, higher levels of assurance in swarm behavior can be obtained via formal methods [32, 30, 16, 5, 18, 23].
Distributed Control for a Multi-Agent System to Pass through a Connected Quadrangle Virtual Tube
Gao, Yan, Bai, Chenggang, Quan, Quan
To guide a multi-agent system through a cluttered environment, a connected quadrangle virtual tube is designed for all agents to keep moving within it; its basic unit is called the single trapezoid virtual tube. There is no obstacle inside the tube, so the area inside the tube can be regarded as a safety zone. A distributed swarm controller is then proposed for the single trapezoid virtual tube passing problem, which is resolved by a gradient vector field method with no local minima. Formal analyses and proofs show that all agents are able to pass through the single trapezoid virtual tube. A modified controller is also put forward for convenience in practical use. For the connected quadrangle virtual tube, a modified switching logic is proposed to avoid deadlock and prevent agents from moving outside the virtual tube. Finally, the effectiveness of the proposed method is validated by numerical simulations and real experiments.