AITopics

doi: 10.1145/3450741.3465263

2106.02921

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > Italy (0.06)
North America > United States > Utah > Utah County > Provo (0.04)
(5 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.88)

arXiv.org Machine LearningJun-5-2021

Unbiased Self-Play

Ohsawa, Shohei

We present a general optimization framework for emergent belief-state representation without any supervision. We employed the common configuration of multiagent reinforcement learning and communication to improve exploration coverage over an environment by leveraging the knowledge of each agent. In this paper, we obtained that recurrent neural nets (RNNs) with shared weights are highly biased in partially observable environments because of their noncooperativity. To address this, we designated an unbiased version of self-play via mechanism design, also known as reverse game theory, to clarify unbiased knowledge at the Bayesian Nash equilibrium. The key idea is to add imaginary rewards using the peer prediction mechanism (Miller et al., 2005), i.e., a mechanism for mutually criticizing information in a decentralized environment. Numerical analyses, including StarCraft exploration tasks with up to 20 agents and off-the-shelf RNNs, demonstrate the state-of-the-art performance.

agent, information, mechanism, (15 more...)

arXiv.org Machine Learning

2106.03007

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > Russia (0.04)
Asia > Russia (0.04)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games > Computer Games (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Auction-based and Distributed Optimization Approaches for Scheduling Observations in Satellite Constellations with Exclusive Orbit Portions

Picard, Gauthier

We investigate the use of multi-agent allocation techniques on problems related to Earth observation scenarios with multiple users and satellites. We focus on the problem of coordinating users having reserved exclusive orbit portions and one central planner having several requests that may use some intervals of these exclusives. We define this problem as Earth Observation Satellite Constellation Scheduling Problem (EOSCSP) and map it to a Mixed Integer Linear Program. As to solve EOSCSP, we propose market-based techniques and a distributed problem solving technique based on Distributed Constraint Optimization (DCOP), where agents cooperate to allocate requests without sharing their own schedules. These contributions are experimentally evaluated on randomly generated EOSCSP instances based on real large-scale or highly conflicting observation order books.

eoscsp, exclusive user, satellite, (14 more...)

2106.03548

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.82)

Discovering Multi-Agent Auto-Curricula in Two-Player Zero-Sum Games

Feng, Xidong, Slumbers, Oliver, Yang, Yaodong, Wan, Ziyu, Liu, Bo, McAleer, Stephen, Wen, Ying, Wang, Jun

When solving two-player zero-sum games, multi-agent reinforcement learning (MARL) algorithms often create populations of agents where, at each iteration, a new agent is discovered as the best response to a mixture over the opponent population. Within such a process, the update rules of "who to compete with" (i.e., the opponent mixture) and "how to beat them" (i.e., finding best responses) are underpinned by manually developed game theoretical principles such as fictitious play and Double Oracle. In this paper we introduce a framework, LMAC, based on meta-gradient descent that automates the discovery of the update rule without explicit human design. Specifically, we parameterise the opponent selection module by neural networks and the best-response module by optimisation subroutines, and update their parameters solely via interaction with the game engine, where both players aim to minimise their exploitability. Surprisingly, even without human design, the discovered MARL algorithms achieve competitive or even better performance with the state-of-the-art population-based game solvers (e.g., PSRO) on Games of Skill, differentiable Lotto, non-transitive Mixture Games, Iterated Matching Pennies, and Kuhn Poker. Additionally, we show that LMAC is able to generalise from small games to large games, for example training on Kuhn Poker and outperforming PSRO on Leduc Poker. Our work inspires a promising future direction to discover general MARL algorithms solely from data.

arxiv preprint arxiv, oracle, umber, (13 more...)

2106.02745

Country:

North America > United States > California > Orange County > Irvine (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games > Computer Games (0.48)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Emergent Communication of Generalizations

Mu, Jesse, Goodman, Noah

To build agents that can collaborate effectively with others, recent research has trained artificial agents to communicate with each other in Lewis-style referential games. However, this often leads to successful but uninterpretable communication. We argue that this is due to the game objective: communicating about a single object in a shared visual context is prone to overfitting and does not encourage language useful beyond concrete reference. In contrast, human language conveys a rich variety of abstract ideas. To promote such skills, we propose games that require communicating generalizations over sets of objects representing abstract visual concepts, optionally with separate contexts for each agent. We find that these games greatly improve systematicity and interpretability of the learned languages, according to several metrics in the literature. Finally, we propose a method for identifying logical operations embedded in the emergent languages by learning an approximate compositional reconstruction of the language.

agent, reference game, triangle, (16 more...)

2106.02668

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Education (0.46)
Leisure & Entertainment > Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.66)

MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning

Zhou, Ming, Wan, Ziyu, Wang, Hanjing, Wen, Muning, Wu, Runzhe, Wen, Ying, Yang, Yaodong, Zhang, Weinan, Wang, Jun

Population-based multi-agent reinforcement learning (PB-MARL) refers to the series of methods nested with reinforcement learning (RL) algorithms, which produces a self-generated sequence of tasks arising from the coupled population dynamics. By leveraging auto-curricula to induce a population of distinct emergent strategies, PB-MARL has achieved impressive success in tackling multi-agent tasks. Despite remarkable prior arts of distributed RL frameworks, PB-MARL poses new challenges for parallelizing the training frameworks due to the additional complexity of multiple nested workloads between sampling, training and evaluation involved with heterogeneous policy interactions. To solve these problems, we present MALib, a scalable and efficient computing framework for PB-MARL. Our framework is comprised of three key components: (1) a centralized task dispatching model, which supports the self-generated tasks and scalable training with heterogeneous policy combinations; (2) a programming architecture named Actor-Evaluator-Learner, which achieves high parallelism for both training and sampling, and meets the evaluation requirement of auto-curriculum learning; (3) a higher-level abstraction of MARL training paradigms, which enables efficient code reuse and flexible deployments on different distributed computing paradigms. Experiments on a series of complex tasks such as multi-agent Atari Games show that MALib achieves throughput higher than 40K FPS on a single machine with $32$ CPU cores; 5x speedup than RLlib and at least 3x speedup than OpenSpiel in multi-agent training tasks. MALib is publicly available at https://github.com/sjtu-marl/malib.

agent, algorithm, malib, (13 more...)

2106.07551

Country:

Asia > China > Shanghai > Shanghai (0.04)
Asia > Middle East > Jordan (0.04)
North America > Puerto Rico > San Juan > San Juan (0.04)
(2 more...)

Genre:

Workflow (0.88)
Research Report (0.64)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

arXiv.org Machine LearningJun-4-2021

Data-driven discovery of interacting particle systems using Gaussian processes

Feng, Jinchao, Ren, Yunxiang, Tang, Sui

Interacting particle or agent systems that display a rich variety of collection motions are ubiquitous in science and engineering. A fundamental and challenging goal is to understand the link between individual interaction rules and collective behaviors. In this paper, we study the data-driven discovery of distance-based interaction laws in second-order interacting particle systems. We propose a learning approach that models the latent interaction kernel functions as Gaussian processes, which can simultaneously fulfill two inference goals: one is the nonparametric inference of interaction kernel function with the pointwise uncertainty quantification, and the other one is the inference of unknown parameters in the non-collective forces of the system. We formulate learning interaction kernel functions as a statistical inverse problem and provide a detailed analysis of recoverability conditions, establishing that a coercivity condition is sufficient for recoverability. We provide a finite-sample analysis, showing that our posterior mean estimator converges at an optimal rate equal to the one in the classical 1-dimensional Kernel Ridge regression. Numerical results on systems that exhibit different collective behaviors demonstrate efficient learning of our approach from scarce noisy trajectory data.

estimator, operator, trajectory data, (15 more...)

arXiv.org Machine Learning

2106.02735

Country:

North America > United States > Oregon (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
(3 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Kernel Methods (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.45)

Gamage, Chathura, Stephenson, Matthew, Pinto, Vimukthini, Renz, Jochen

Deceptive Level Generation for Angry Birds

arXiv.org Artificial IntelligenceJun-3-2021

The Angry Birds AI competition has been held over many years to encourage the development of AI agents that can play Angry Birds game levels better than human players. Many different agents with various approaches have been employed over the competition's lifetime to solve this task. Even though the performance of these agents has increased significantly over the past few years, they still show major drawbacks in playing deceptive levels. This is because most of the current agents try to identify the best next shot rather than planning an effective sequence of shots. In order to encourage advancements in such agents, we present an automated methodology to generate deceptive game levels for Angry Birds. Even though there are many existing content generators for Angry Birds, they do not focus on generating deceptive levels. In this paper, we propose a procedure to generate deceptive levels for six deception categories that can fool the state-of-the-art Angry Birds playing AI agents. Our results show that generated deceptive levels exhibit similar characteristics of human-created deceptive levels. Additionally, we define metrics to measure the stability, solvability, and degree of deception of the generated levels.

agent, artificial intelligence, machine learning, (16 more...)

2106.01639

Country:

Europe > Netherlands > Limburg > Maastricht (0.04)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (0.54)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial IntelligenceJun-3-2021

Modeling Communication to Coordinate Perspectives in Cooperation

Stacy, Stephanie, Li, Chenfei, Zhao, Minglu, Yun, Yiling, Zhao, Qingyi, Kleiman-Weiner, Max, Gao, Tao

Communication is highly overloaded. Despite this, even young children are good at leveraging context to understand ambiguous signals. We propose a computational account of overloaded signaling from a shared agency perspective which we call the Imagined We for Communication. Under this framework, communication helps cooperators coordinate their perspectives, allowing them to act together to achieve shared goals. We assume agents are rational cooperators, which puts constraints on how signals can be sent and interpreted. We implement this model in a set of simulations demonstrating this model's success under increasing ambiguity as well as increasing layers of reasoning. Our model is capable of improving performance with deeper recursive reasoning; however, it outperforms comparison baselines at even the shallowest level, highlighting how shared knowledge and cooperative logic can do much of the heavy-lifting in language.

communication, reasoning, receiver, (16 more...)

2106.02164

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

arXiv.org Artificial IntelligenceJun-3-2021

The Smoothed Satisfaction of Voting Axioms

Xia, Lirong

We initiate the work towards a comprehensive picture of the smoothed satisfaction of voting axioms, to provide a finer and more realistic foundation for comparing voting rules. We adopt the smoothed social choice framework, where an adversary chooses arbitrarily correlated "ground truth" preferences for the agents, on top of which random noises are added. We focus on characterizing the smoothed satisfaction of two well-studied voting axioms: Condorcet criterion and participation. We prove that for any fixed number of alternatives, when the number of voters $n$ is sufficiently large, the smoothed satisfaction of the Condorcet criterion under a wide range of voting rules is $1$, $1-\exp(-\Theta(n))$, $\Theta(n^{-0.5})$, $ \exp(-\Theta(n))$, or being $\Theta(1)$ and $1-\Theta(1)$ at the same time; and the smoothed satisfaction of participation is $1-\Theta(n^{-0.5})$. Our results address open questions by Berg and Lepelley in 1994 for these rules, and also confirm the following high-level message: the Condorcet criterion is a bigger concern than participation under realistic models.

condorcet winner, cww, denote, (17 more...)

2106.01947

Country:

North America > United States > Washington > King County > Bellevue (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
North America > United States > California > Los Angeles County > Pasadena (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Government > Voting & Elections (0.70)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.92)