AITopics | Agents

Collaborating Authors

Agents

News Overviews Instructional Materials AI-Alerts Classics

How to Train PointGoal Navigation Agents on a (Sample and Compute) Budget

Wijmans, Erik, Essa, Irfan, Batra, Dhruv

arXiv.org Artificial IntelligenceDec-10-2020

PointGoal navigation has seen significant recent interest and progress, spurred on by the Habitat platform and associated challenge. In this paper, we study PointGoal navigation under both a sample budget (75 million frames) and a compute budget (1 GPU for 1 day). We conduct an extensive set of experiments, cumulatively totaling over 50,000 GPU-hours, that let us identify and discuss a number of ostensibly minor but significant design choices -- the advantage estimation procedure (a key component in training), visual encoder architecture, and a seemingly minor hyper-parameter change. Overall, these design choices to lead considerable and consistent improvements over the baselines present in Savva et al. Under a sample budget, performance for RGB-D agents improves 8 SPL on Gibson (14% relative improvement) and 20 SPL on Matterport3D (38% relative improvement). Under a compute budget, performance for RGB-D agents improves by 19 SPL on Gibson (32% relative improvement) and 35 SPL on Matterport3D (220% relative improvement). We hope our findings and recommendations will make serve to make the community's experiments more efficient.

agent, efficiency, proceedings, (15 more...)

arXiv.org Artificial Intelligence

2012.06117

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Robots (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Learning to Resolve Conflicts for Multi-Agent Path Finding with Conflict-Based Search

Huang, Taoan, Dilkina, Bistra, Koenig, Sven

arXiv.org Artificial IntelligenceDec-10-2020

Conflict-Based Search (CBS) is a state-of-the-art algorithm for multi-agent path finding. At the high level, CBS repeatedly detects conflicts and resolves one of them by splitting the current problem into two subproblems. Previous work chooses the conflict to resolve by categorizing the conflict into three classes and always picking a conflict from the highest-priority class. In this work, we propose an oracle for conflict selection that results in smaller search tree sizes than the one used in previous work. However, the computation of the oracle is slow. Thus, we propose a machine-learning framework for conflict selection that observes the decisions made by the oracle and learns a conflict-selection strategy represented by a linear ranking function that imitates the oracle's decisions accurately and quickly. Experiments on benchmark maps indicate that our method significantly improves the success rates, the search tree sizes and runtimes over the current state-of-the-art CBS solver.

agent, conflict, ranking function, (13 more...)

arXiv.org Artificial Intelligence

2012.06005

Country:

North America > United States > California (0.14)
North America > United States > Texas > Tarrant County > Fort Worth (0.04)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment (0.67)
Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Persuading Voters in District-based Elections

Castiglioni, Matteo, Gatti, Nicola

arXiv.org Artificial IntelligenceDec-10-2020

We focus on the scenario in which an agent can exploit his information advantage to manipulate the outcome of an election. In particular, we study district-based elections with two candidates, in which the winner of the election is the candidate that wins in the majority of the districts. District-based elections are adopted worldwide (e.g., UK and USA) and are a natural extension of widely studied voting mechanisms (e.g., k-voting and plurality voting). We resort to the Bayesian persuasion framework, where the manipulator (sender) strategically discloses information to the voters (receivers) that update their beliefs rationally. We study both private signaling, in which the sender can use a private communication channel per receiver, and public signaling, in which the sender can use a single communication channel for all the receivers. Furthermore, for the first time, we introduce semi-public signaling in which the sender can use a single communication channel per district. We show that there is a sharp distinction between private and (semi-)public signaling. In particular, optimal private signaling schemes can provide an arbitrarily better probability of victory than (semi-)public ones and can be computed efficiently, while optimal (semi-)public signaling schemes cannot be approximated to within any factor in polynomial time unless P=NP. However, we show that reasonable relaxations allow the design of multi-criteria PTASs for optimal (semi-)public signaling schemes. In doing so, we introduce a novel property, namely comparative stability, and we design a bi-criteria PTAS for public signaling in general Bayesian persuasion problems beyond elections when the sender's utility function is state-dependent.

probability, receiver, sender, (16 more...)

arXiv.org Artificial Intelligence

2012.05002

Country:

Europe > Kosovo > District of Gjilan > Kamenica (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Italy > Lombardy > Milan (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Government > Voting & Elections (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)

Add feedback

From particle swarm optimization to consensus based optimization: stochastic modeling and mean-field limit

Grassi, Sara, Pareschi, Lorenzo

arXiv.org Machine LearningDec-10-2020

In this paper we consider a continuous description based on stochastic differential equations of the popular particle swarm optimization (PSO) process for solving global optimization problems and derive in the large particle limit the corresponding mean-field approximation based on Vlasov-Fokker-Planck-type equations. The disadvantage of memory effects induced by the need to store the local best position is overcome by the introduction of an additional differential equation describing the evolution of the local best. A regularization process for the global best permits to formally derive the respective mean-field description. Subsequently, in the small inertia limit, we compute the related macroscopic hydrodynamic equations that clarify the link with the recently introduced consensus based optimization (CBO) methods. Several numerical examples illustrate the mean field process, the small inertia limit and the potential of this general class of global optimization methods.

ackley function, equation, optimization, (13 more...)

arXiv.org Machine Learning

2012.05613

Country:

Europe > Italy (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(3 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Add feedback

Congratulations to the #NeurIPS2020 award winners

AIHubDec-8-2020, 14:23:40 GMT

The winners of the NeurIPS 2020 awards have been announced. This year, three papers have received Best Paper Awards. There was also one Test of Time Award; this recognises a paper that has had significant and lasting impact on the community. No-Regret Learning Dynamics for Extensive-Form Correlated Equilibrium Andrea Celli, Alberto Marchesi, Gabriele Farina, Nicola Gatti Abstract: The existence of simple, uncoupled no-regret dynamics that converge to correlated equilibria in normal-form games is a celebrated result in the theory of multi-agent systems. Specifically, it has been known for more than 20 years that when all players seek to minimize their internal regret in a repeated normal-form game, the empirical frequency of play converges to a normal-form correlated equilibrium.

dataset, neurips2020 award winner, normal-form game, (14 more...)

AIHub

Genre: Personal > Honors > Award (0.91)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.33)

Add feedback

Resolving Implicit Coordination in Multi-Agent Deep Reinforcement Learning with Deep Q-Networks & Game Theory

Adams, Griffin, Padmanabhan, Sarguna Janani, Shekhar, Shivang

arXiv.org Artificial IntelligenceDec-8-2020

We address two major challenges of implicit coordination in multi-agent deep reinforcement learning: non-stationarity and exponential growth of state-action space, by combining Deep-Q Networks for policy learning with Nash equilibrium for action selection. Q-values proxy as payoffs in Nash settings, and mutual best responses define joint action selection. Coordination is implicit because multiple/no Nash equilibria are resolved deterministically. We demonstrate that knowledge of game type leads to an assumption of mirrored best responses and faster convergence than Nash-Q. Specifically, the Friend-or-Foe algorithm demonstrates signs of convergence to a Set Controller which jointly chooses actions for two agents. This encouraging given the highly unstable nature of decentralized coordination over joint actions. Inspired by the dueling network architecture, which decouples the Q-function into state and advantage streams, as well as residual networks, we learn both a single and joint agent representation, and merge them via element-wise addition. This simplifies coordination by recasting it is as learning a residual function. We also draw high level comparative insights on key MADRL and game theoretic variables: competitive vs. cooperative, asynchronous vs. parallel learning, greedy versus socially optimal Nash equilibria tie breaking, and strategies for the no Nash equilibrium case. We evaluate on 3 custom environments written in Python using OpenAI Gym: a Predator Prey environment, an alternating Warehouse environment, and a Synchronization environment. Each environment requires successively more coordination to achieve positive rewards.

agent, architecture, coordination, (15 more...)

arXiv.org Artificial Intelligence

2012.09136

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

EvoCraft: A New Challenge for Open-Endedness

Grbic, Djordje, Palm, Rasmus Berg, Najarro, Elias, Glanois, Claire, Risi, Sebastian

arXiv.org Artificial IntelligenceDec-8-2020

This paper introduces EvoCraft, a framework for Minecraft designed to study open-ended algorithms. We introduce an API that provides an open-source Python interface for communicating with Minecraft to place and track blocks. In contrast to previous work in Minecraft that focused on learning to play the game, the grand challenge we pose here is to automatically search for increasingly complex artifacts in an open-ended fashion. Compared to other environments used to study open-endedness, Minecraft allows the construction of almost any kind of structure, including actuated machines with circuits and mechanical components. We present initial baseline results in evolving simple Minecraft creations through both interactive and automated evolution. While evolution succeeds when tasked to grow a structure towards a specific target, it is unable to find a solution when rewarded for creating a simple machine that moves. Thus, EvoCraft offers a challenging new environment for automated search methods (such as evolution) to find complex artifacts that we hope will spur the development of more open-ended algorithms.

evocraft, evolution, minecraft, (14 more...)

arXiv.org Artificial Intelligence

2012.04751

Country:

Europe > Sweden > Skåne County > Malmö (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Games (1.00)

Add feedback

Low-Bandwidth Communication Emerges Naturally in Multi-Agent Learning Systems

Grupen, Niko A., Lee, Daniel D., Selman, Bart

arXiv.org Artificial IntelligenceDec-8-2020

In this work, we study emergent communication through the lens of cooperative multi-agent behavior in nature. Using insights from animal communication, we propose a spectrum from low-bandwidth (e.g. pheromone trails) to high-bandwidth (e.g. compositional language) communication that is based on the cognitive, perceptual, and behavioral capabilities of social agents. Through a series of experiments with pursuit-evasion games, we identify multi-agent reinforcement learning algorithms as a computational model for the low-bandwidth end of the communication spectrum.

communication, predator, prey, (15 more...)

arXiv.org Artificial Intelligence

2011.1489

Country:

North America > United States > Illinois > Cook County > Chicago (0.05)
Oceania > New Zealand (0.04)
Indian Ocean > Red Sea (0.04)
(7 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.48)

Add feedback

Lifted Bayesian Filtering in Multiset Rewriting Systems

Lüdtke, Stefan (University of Rostock) | Kirste, Thomas (University of Rostock)

Journal of Artificial Intelligence ResearchDec-7-2020

We present a model for Bayesian filtering (BF) in discrete dynamic systems where multiple entities (inter)-act, i.e. where the system dynamics is naturally described by a Multiset rewriting system (MRS). Typically, BF in such situations is computationally expensive due to the high number of discrete states that need to be maintained explicitly. We devise a lifted state representation, based on a suitable decomposition of multiset states, such that some factors of the distribution are exchangeable and thus afford an efficient representation. Intuitively, this representation groups together similar entities whose properties follow an exchangeable joint distribution. Subsequently, we introduce a BF algorithm that works directly on lifted states, without resorting to the original, much larger ground representation. This algorithm directly lends itself to approximate versions by limiting the number of explicitly represented lifted states in the posterior. We show empirically that the lifted representation can lead to a factorial reduction in the representational complexity of the distribution, and in the approximate cases can lead to a lower variance of the estimate and a lower estimation error compared to the original, ground representation.

probability, representation, scenario, (14 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.12066

AI Access Foundation

12066

Journal of Artificial Intelligence Research

Country:

North America > United States > New York (0.04)
Europe > Italy (0.04)
Europe > Germany (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.45)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.93)
(5 more...)

Add feedback

MultiON: Benchmarking Semantic Map Memory using Multi-Object Navigation

Wani, Saim, Patel, Shivansh, Jain, Unnat, Chang, Angel X., Savva, Manolis

arXiv.org Artificial IntelligenceDec-7-2020

Navigation tasks in photorealistic 3D environments are challenging because they require perception and effective planning under partial observability. Recent work shows that map-like memory is useful for long-horizon navigation tasks. However, a focused investigation of the impact of maps on navigation tasks of varying complexity has not yet been performed. We propose the multiON task, which requires navigation to an episode-specific sequence of objects in a realistic environment. MultiON generalizes the ObjectGoal navigation task and explicitly tests the ability of navigation agents to locate previously observed goal objects. We perform a set of multiON experiments to examine how a variety of agent models perform across a spectrum of navigation task complexities. Our experiments show that: i) navigation performance degrades dramatically with escalating task complexity; ii) a simple semantic map agent performs surprisingly well relative to more complex neural image feature map agents; and iii) even oracle map agents achieve relatively low performance, indicating the potential for future work in training embodied navigation agents using maps. Video summary: https://youtu.be/yqTlHNIcgnY

agent, information, navigation, (14 more...)

arXiv.org Artificial Intelligence

2012.03912

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback