AITopics | Agent Societies

Information State Embedding in Partially Observable Cooperative Multi-Agent Reinforcement Learning

Mao, Weichao, Zhang, Kaiqing, Miehling, Erik, Başar, Tamer

arXiv.org Artificial Intelligence, Apr-2-2020

Multi-agent reinforcement learning (MARL) under partial observability has long been considered challenging, primarily due to the requirement for each agent to maintain a belief over all other agents' local histories -- a domain that generally grows exponentially over time. In this work, we investigate a partially observable MARL problem in which agents are cooperative. To enable the development of tractable algorithms, we introduce the concept of an information state embedding that serves to compress agents' histories. We quantify how the compression error influences the resulting value functions for decentralized control. Furthermore, we propose three natural embeddings, based on finite-memory truncation, principal component analysis, and recurrent neural networks. The outputs of these embeddings are then used as the information state and can be fed into any MARL algorithm. The proposed embed-then-learn pipeline opens the black box of existing MARL algorithms, allowing us to establish some theoretical guarantees (error bounds of value functions) while still achieving competitive performance with many end-to-end approaches.
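Of the three embeddings named above, finite-memory truncation is the simplest to illustrate. The following is a minimal Python sketch, not the authors' implementation: the class name `FiniteMemoryEmbedding`, the `window` parameter, and the choice to represent a local history as (action, observation) pairs are all assumptions made for this example.

```python
from collections import deque
from typing import Deque, Hashable, Tuple

Step = Tuple[Hashable, Hashable]  # hypothetical (action, observation) pair


class FiniteMemoryEmbedding:
    """Compress an agent's local history to its last `window` steps."""

    def __init__(self, window: int) -> None:
        if window < 1:
            raise ValueError("window must be at least 1")
        # A deque with maxlen silently drops the oldest step when full.
        self._buffer: Deque[Step] = deque(maxlen=window)

    def update(self, action: Hashable, observation: Hashable) -> Tuple[Step, ...]:
        """Append the newest step and return the truncated history."""
        self._buffer.append((action, observation))
        return tuple(self._buffer)


embedding = FiniteMemoryEmbedding(window=2)
for action, observation in [("left", 0), ("right", 1), ("left", 1)]:
    state = embedding.update(action, observation)
# `state` now holds only the two most recent steps.
```

Because the returned tuple is hashable, it could serve directly as a lookup key in a tabular learner; the compression error studied in the abstract comes from the steps that fall out of the window.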

agent, information, information state

arXiv: 2004.01098

Country:

North America > United States > Illinois > Champaign County > Urbana (0.14)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Industry:

Information Technology (0.46)
Transportation (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.64)

Anytime and Efficient Coalition Formation with Spatial and Temporal Constraints

Capezzuto, Luca, Tarapore, Danesh, Ramchurn, Sarvapali D.

arXiv.org Artificial Intelligence, Apr-2-2020

The Coalition Formation with Spatial and Temporal constraints Problem (CFSTP) is a multi-agent task allocation problem where the agents are cooperative and few, the tasks are many, spatially distributed, with deadlines and workloads, and the objective is to find a schedule that maximises the number of completed tasks. The current state-of-the-art CFSTP solver, the Coalition Formation with Look-Ahead (CFLA) algorithm, has two main limitations. First, its time complexity is quadratic in the number of tasks and exponential in the number of agents, which makes it inefficient. Second, its look-ahead technique is not effective in real-world scenarios, such as open multi-agent systems, where new tasks can appear at any time. Motivated by this, we propose an extension of CFLA, which we call Coalition Formation with Improved Look-Ahead (CFLA+). Since CFLA+ inherits the limitations of CFLA, we also develop a novel algorithm to solve the CFSTP, the first to be both anytime and efficient, which we call Cluster-based Coalition Formation (CCF). We empirically show that, in settings where the look-ahead technique is highly effective, CCF completes up to 20% (resp. 10%) more tasks than CFLA (resp. CFLA+) while being up to four orders of magnitude faster. Our results affirm CCF as the new state-of-the-art CFSTP solver.

agent, algorithm, cfla

arXiv: 2003.13806

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Hampshire > Southampton (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.35)

Counterfactual Multi-Agent Reinforcement Learning with Graph Convolution Communication

Su, Jianyu, Adams, Stephen, Beling, Peter A.

arXiv.org Artificial Intelligence, Apr-1-2020

We consider a fully cooperative multi-agent system where agents cooperate to maximize a system's utility in a partially observable environment. We propose that multi-agent systems must have the ability to (1) communicate and understand the interplay between agents and (2) correctly distribute rewards based on an individual agent's contribution. In contrast, most work in this setting considers only one of the above abilities. In this study, we develop an architecture that allows for communication among agents and tailors the system's reward for each individual agent. Our architecture represents agent communication through graph convolution and applies an existing credit assignment structure, counterfactual multi-agent policy gradient (COMA), to help agents learn to communicate via back-propagation. The flexibility of the graph structure makes our method applicable to a variety of multi-agent systems, e.g. dynamic systems that consist of varying numbers of agents and static systems with a fixed number of agents. We evaluate our method on a range of tasks, demonstrating the advantage of marrying communication with credit assignment. In the experiments, our proposed method yields better performance than state-of-the-art methods, including COMA. Moreover, we show that the communication strategies offer insight into the system's cooperative policies and make them more interpretable.
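The credit assignment step that COMA contributes can be illustrated in isolation. In COMA, an agent's advantage is the joint-action value of the action it actually took minus a counterfactual baseline: the expected value over that agent's alternative actions, with the other agents' actions held fixed. The Python sketch below covers only that calculation and leaves out the graph convolution communication layer entirely; the function name and argument layout are assumptions made for this example, not the paper's code.

```python
from typing import Sequence


def counterfactual_advantage(
    q_values: Sequence[float],
    policy: Sequence[float],
    chosen: int,
) -> float:
    """COMA-style advantage for one agent.

    q_values[i] is the critic's estimate of the joint-action value when this
    agent takes action i and all other agents keep their actual actions.
    policy[i] is the probability this agent assigns to action i.
    """
    if len(q_values) != len(policy):
        raise ValueError("q_values and policy must have the same length")
    # Counterfactual baseline: marginalise out this agent's own action.
    baseline = sum(p * q for p, q in zip(policy, q_values))
    return q_values[chosen] - baseline


# Two available actions, uniform policy, agent took action 1.
advantage = counterfactual_advantage([1.0, 3.0], [0.5, 0.5], chosen=1)
```

A positive advantage means the chosen action did better than the agent's own average, which is the signal used to credit that agent individually.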

agent, communication, information

arXiv: 2004.0047

Country: North America > United States > Virginia > Albemarle County > Charlottesville (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Mimicking Evolution with Reinforcement Learning

Abrantes, João P., Abrantes, Arnaldo J., Oliehoek, Frans A.

arXiv.org Artificial Intelligence, Mar-31-2020

Evolution gave rise to human and animal intelligence here on Earth. We argue that the path to developing artificial human-like intelligence will pass through mimicking the evolutionary process in a nature-like simulation. In Nature, there are two processes driving the development of the brain: evolution and learning. Evolution acts slowly, across generations, and amongst other things, it defines what agents learn by changing their internal reward function. Learning acts fast, across one's lifetime, and it quickly updates agents' policy to maximise pleasure and minimise pain. Evolution slowly aligns the reward function with the fitness function; however, as agents evolve, the environment and its fitness function also change, increasing the misalignment between reward and fitness. It is extremely computationally expensive to replicate these two processes in simulation. This work proposes Evolution via Evolutionary Reward (EvER), which allows learning to single-handedly drive the search for policies with increasing evolutionary fitness by ensuring the alignment of the reward function with the fitness function. In this search, EvER makes use of the whole state-action trajectories that agents go through during their lifetime. In contrast, current evolutionary algorithms discard this information and consequently limit their potential efficiency at tackling sequential decision problems. We test our algorithm in two simple bio-inspired environments and show that it generates agents more capable of surviving and reproducing their genes than a state-of-the-art evolutionary algorithm does.

agent, algorithm, reward function

arXiv: 2004.00048

Country:

Europe > United Kingdom > England > Essex > Colchester (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)
Europe > Netherlands > South Holland > Delft (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine (0.47)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Researchers propose paradigm that trains AI agents through evolution

#artificialintelligence, Mar-29-2020, 12:34:54 GMT

A paper published by researchers at Carnegie Mellon University, San Francisco research firm OpenAI, Facebook AI Research, the University of California at Berkeley, and Shanghai Jiao Tong University describes a paradigm that scales up multi-agent reinforcement learning, where AI models learn by having agents interact within an environment such that the agent population increases in size over time. By maintaining sets of agents in each training stage and performing mix-and-match and fine-tuning steps over these sets, the coauthors say the paradigm -- Evolutionary Population Curriculum -- is able to promote agents with the best adaptability to the next stage. In computer science, evolutionary computation is the family of algorithms for global optimization inspired by biological evolution. Instead of following explicit mathematical gradients, these models generate variants, test them, and retain the top performers. They've shown promise in early work by OpenAI, Google, Uber, and others, but they're somewhat tough to prototype because there's a dearth of tools targeting evolutionary algorithms and natural evolution strategies (NES).
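The generate, test, and retain loop described above can be sketched in a few lines of Python. This is a generic elitist evolutionary search on a one-dimensional toy objective, not Evolutionary Population Curriculum and not any particular NES library; the function name `evolve`, its parameters, and the toy fitness function are all assumptions made for this example.

```python
import random
from typing import Callable, List


def evolve(
    fitness: Callable[[float], float],
    population: List[float],
    generations: int,
    keep: int,
    sigma: float,
    rng: random.Random,
) -> List[float]:
    """Gradient-free search: mutate, evaluate, keep the top performers."""
    for _ in range(generations):
        # Generate variants by adding Gaussian noise to each candidate.
        variants = [x + rng.gauss(0.0, sigma) for x in population]
        # Test parents and variants together, best first.
        pool = sorted(population + variants, key=fitness, reverse=True)
        # Retain only the top `keep` candidates for the next generation.
        population = pool[:keep]
    return population


def toy_fitness(x: float) -> float:
    """Hypothetical objective with its maximum at x = 3."""
    return -(x - 3.0) ** 2


survivors = evolve(
    toy_fitness,
    population=[0.0, 10.0, -5.0, 1.0],
    generations=50,
    keep=4,
    sigma=0.5,
    rng=random.Random(0),
)
best = survivors[0]
```

Because parents compete alongside their variants, the best fitness in the population never decreases from one generation to the next.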

agent, evolutionary population curriculum, researcher propose paradigm

Country:

North America > United States > California > San Francisco County > San Francisco (0.26)
Asia > China > Shanghai > Shanghai (0.26)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.40)

Parallel Knowledge Transfer in Multi-Agent Reinforcement Learning

Liang, Yongyuan, Li, Bangwei

arXiv.org Artificial Intelligence, Mar-29-2020

Multi-agent reinforcement learning is a standard framework for modeling multi-agent interactions applied in real-world scenarios. Inspired by experience sharing in human groups, reusing learned knowledge in parallel across agents can potentially improve team learning performance, especially in multi-task environments. When all agents interact with the environment and learn simultaneously, how each independent agent selectively learns from other agents' behavior knowledge is a problem that we need to solve. This paper proposes a novel knowledge transfer framework in MARL, PAT (Parallel Attentional Transfer). We design two acting modes in PAT, student mode and self-learning mode. Each agent in our approach trains a decentralized student actor-critic to determine its acting mode at each time step. When agents are unfamiliar with the environment, the shared attention mechanism in student mode effectively selects learning knowledge from other agents to decide agents' actions. In empirical evaluations, PAT outperforms prior state-of-the-art advising approaches. Our approach not only significantly improves team learning rate and global performance, but is also flexible and transferable to various multi-agent systems.

agent, knowledge, learning

arXiv: 2003.13085

Country:

North America > United States (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report (0.82)

Industry: Education (0.96)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.69)

Multi-Agent Reinforcement Learning for Problems with Combined Individual and Team Reward

Sheikh, Hassam Ullah, Bölöni, Ladislau

arXiv.org Artificial Intelligence, Mar-23-2020

Many cooperative multi-agent problems require agents to learn individual tasks while contributing to the collective success of the group. This is a challenging task for current state-of-the-art multi-agent reinforcement learning algorithms, which are designed to maximize either the global reward of the team or the individual local rewards. The problem is exacerbated when either of the rewards is sparse, leading to unstable learning. To address this problem, we present Decomposed Multi-Agent Deep Deterministic Policy Gradient (DE-MADDPG): a novel cooperative multi-agent reinforcement learning framework that simultaneously learns to maximize the global and local rewards. We evaluate our solution on the challenging defensive escort team problem and show that our solution achieves a significantly better and more stable performance than the direct adaptation of the MADDPG algorithm.

agent, maddpg, maupg

arXiv: 2003.10598

Country: North America > United States > Florida > Orange County > Orlando (0.04)

Genre: Research Report (0.83)

Industry: Leisure & Entertainment (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.68)

Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning

Long, Qian, Zhou, Zihan, Gupta, Abhinav, Fang, Fei, Wu, Yi, Wang, Xiaolong

arXiv.org Artificial Intelligence, Mar-23-2020

In multi-agent games, the complexity of the environment can grow exponentially as the number of agents increases, so it is particularly challenging to learn good policies when the agent population is large. In this paper, we introduce Evolutionary Population Curriculum (EPC), a curriculum learning paradigm that scales up Multi-Agent Reinforcement Learning (MARL) by progressively increasing the population of training agents in a stage-wise manner. Furthermore, EPC uses an evolutionary approach to fix an objective misalignment issue throughout the curriculum: agents successfully trained in an early stage with a small population are not necessarily the best candidates for adapting to later stages with scaled populations. Concretely, EPC maintains multiple sets of agents in each stage, performs mix-and-match and fine-tuning over these sets and promotes the sets of agents with the best adaptability to the next stage. We implement EPC on a popular MARL algorithm, MADDPG, and empirically show that our approach consistently outperforms baselines by a large margin as the number of agents grows exponentially. The project page is https://sites.google.com/view/epciclr2020.

agent, conference paper, reinforcement learning

arXiv: 2003.10423

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

A Structured Prediction Approach for Generalization in Cooperative Multi-Agent Reinforcement Learning

Carion, Nicolas, Usunier, Nicolas, Synnaeve, Gabriel, Lazaric, Alessandro

Neural Information Processing Systems, Mar-18-2020, 23:48:18 GMT

Effective coordination is crucial to solve multi-agent collaborative (MAC) problems. While centralized reinforcement learning methods can optimally solve small MAC instances, they do not scale to large problems and they fail to generalize to scenarios different from those seen during training. In this paper, we consider MAC problems with some intrinsic notion of locality (e.g., geographic proximity) such that interactions between agents and tasks are locally limited. By leveraging this property, we introduce a novel structured prediction approach to assign agents to tasks. At each step, the assignment is obtained by solving a centralized optimization problem (the inference procedure) whose objective function is parameterized by a learned scoring model.
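The inference procedure described above, which picks an assignment that maximises a score-based objective, can be illustrated with a deliberately small Python sketch. Everything here is an assumption made for illustration: the `scores` matrix stands in for the output of a learned scoring model, the objective is a plain sum of agent-task scores, and the exhaustive search over permutations replaces whatever optimisation method the paper actually uses. It is only practical for a handful of agents and tasks.

```python
from itertools import permutations
from math import inf
from typing import List, Sequence, Tuple


def best_assignment(scores: Sequence[Sequence[float]]) -> Tuple[List[int], float]:
    """Assign each agent a distinct task so that the total score is maximal.

    scores[a][t] is a hypothetical model score for giving task t to agent a.
    Requires at least as many tasks as agents.
    """
    n_agents = len(scores)
    n_tasks = len(scores[0])
    if n_tasks < n_agents:
        raise ValueError("need at least as many tasks as agents")
    best_tasks: List[int] = []
    best_total = -inf
    # Enumerate every way to give the agents distinct tasks.
    for tasks in permutations(range(n_tasks), n_agents):
        total = sum(scores[agent][task] for agent, task in enumerate(tasks))
        if total > best_total:
            best_tasks, best_total = list(tasks), total
    return best_tasks, best_total


# Both agents prefer task 0, but only one of them can have it.
scores = [
    [4.0, 1.0, 0.0],
    [3.0, 2.0, 0.0],
]
assignment, total = best_assignment(scores)
```

The example shows why the decision is made centrally: if each agent independently picked its top-scoring task, both would choose task 0, whereas the joint optimisation sends agent 1 to task 1.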

cooperative multi-agent reinforcement learning, generalization, structured prediction approach

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.40)

Deep Multi-Agent Reinforcement Learning for Decentralized Continuous Cooperative Control

de Witt, Christian Schroeder, Peng, Bei, Kamienny, Pierre-Alexandre, Torr, Philip, Böhmer, Wendelin, Whiteson, Shimon

arXiv.org Artificial Intelligence, Mar-18-2020

Deep multi-agent reinforcement learning (MARL) holds the promise of automating many real-world cooperative robotic manipulation and transportation tasks. Nevertheless, decentralised cooperative robotic control has received less attention from the deep reinforcement learning community, as compared to single-agent robotics and multi-agent games with discrete actions. To address this gap, this paper introduces Multi-Agent Mujoco, an easily extensible multi-agent benchmark suite for robotic control in continuous action spaces. The benchmark tasks are diverse and admit easily configurable partially observable settings. Inspired by the success of single-agent continuous value-based algorithms in robotic control, we also introduce COMIX, a novel extension to a common discrete action multi-agent $Q$-learning algorithm. We show that COMIX significantly outperforms state-of-the-art MADDPG on a partially observable variant of a popular particle environment and matches or surpasses it on Multi-Agent Mujoco. Thanks to this new benchmark suite and method, we can now pose an interesting question: what is the key to performance in such settings, the use of value-based methods instead of policy gradients, or the factorisation of the joint $Q$-function? To answer this question, we propose a second new method, FacMADDPG, which factors MADDPG's critic. Experimental results on Multi-Agent Mujoco suggest that factorisation is the key to performance.

agent, deep multi-agent reinforcement learning, multi-agent mujoco

arXiv: 2003.06709

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > Montana (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Denmark (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Leisure & Entertainment (0.47)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.47)
