AITopics | Agent Societies

Collaborating Authors

Agent Societies

News Overviews Instructional Materials AI-Alerts Classics

A Review of Cooperative Multi-Agent Deep Reinforcement Learning

OroojlooyJadid, Afshin, Hajinezhad, Davood

arXiv.org Artificial IntelligenceAug-11-2019

Deep Reinforcement Learning has made significant progress in multi-agent systems in recent years. In this review article, we have mostly focused on recent papers on Multi-Agent Reinforcement Learning (MARL) than the older papers, unless it was necessary. Several ideas and papers are proposed with different notations, and we tried our best to unify them with a single notation and categorize them by their relevance. In particular, we have focused on five common approaches on modeling and solving multi-agent reinforcement learning problems: (I) independent-learners, (II) fully observable critic, (III) value function decomposition, (IV) consensus, (IV) learn to communicate. Moreover, we discuss some new emerging research areas in MARL along with the relevant recent papers. In addition, some of the recent applications of MARL in real world are discussed. Finally, a list of available environments for MARL research are provided and the paper is concluded with proposals on the possible research directions.

agent, algorithm, learning, (13 more...)

arXiv.org Artificial Intelligence

1908.03963

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Sweden > Stockholm > Stockholm (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(10 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.45)

Industry:

Transportation (1.00)
Leisure & Entertainment > Games > Computer Games (1.00)
Information Technology (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Large-scale Traffic Signal Control Using a Novel Multi-Agent Reinforcement Learning

Wang, Xiaoqiang, Ke, Liangjun, Qiao, Zhimin, Chai, Xinghua

arXiv.org Machine LearningAug-10-2019

Finding the optimal signal timing strategy is a difficult task for the problem of large-scale traffic signal control (TSC). Multi-Agent Reinforcement Learning (MARL) is a promising method to solve this problem. However, there is still room for improvement in extending to large-scale problems and modeling the behaviors of other agents for each individual agent. In this paper, a new MARL, called Cooperative double Q-learning (Co-DQL), is proposed, which has several prominent features. It uses a highly scalable independent double Q-learning method based on double estimators and the UCB policy, which can eliminate the over-estimation problem existing in traditional independent Q-learning while ensuring exploration. It uses mean field approximation to model the interaction among agents, thereby making agents learn a better cooperative strategy. In order to improve the stability and robustness of the learning process, we introduce a new reward allocation mechanism and a local state sharing method. In addition, we analyze the convergence properties of the proposed algorithm. Co-DQL is applied on TSC and tested on a multi-traffic signal simulator. According to the results obtained on several traffic scenarios, Co- DQL outperforms several state-of-the-art decentralized MARL algorithms. It can effectively shorten the average waiting time of the vehicles in the whole road system.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Machine Learning

1908.03761

Country:

Asia > China (0.28)
North America > United States (0.28)

Genre: Research Report (0.84)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

Add feedback

Global Collaboration Tools Market 2019-2023 - Evolution of Artificial Intelligence Bots

#artificialintelligenceAug-4-2019, 10:30:48 GMT

Rising Need for Better Product Management 4.2 Challenges 4.2.1 Slow Adoption Rate 4.2.2

application market, artificial intelligence, global collaboration tool market, (9 more...)

#artificialintelligence

Genre: Press Release (1.00)

Industry:

Information Technology > Security & Privacy (0.53)
Media > News (0.40)

Technology:

Information Technology > Communications > Collaboration (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.57)

Add feedback

Action Semantics Network: Considering the Effects of Actions in Multiagent Systems

Wang, Weixun, Liu, Tianpei Yang Yong, Hao, Jianye, Hao, Xiaotian, Hu, Yujing, Chen, Yingfeng, Fan, Changjie, Gao, Yang

arXiv.org Artificial IntelligenceJul-26-2019

In multiagent systems (MASs), each agent makes individual decisions but all of them contribute globally to the system evolution. Learning in MASs is difficult since the selection of actions must take place in the presence of other co-learning agents. Moreover, the environmental stochasticity and uncertainties increase exponentially with the number of agents. A number of previous works borrow various multiagent coordination mechanisms into deep multiagent learning architecture to facilitate multiagent coordination. However, none of them explicitly consider action semantics between agents. In this paper, we propose a novel network architecture, named Action Semantics Network (ASN), that explicitly represents such action semantics between agents. ASN characterizes different actions' influence on other agents using neural networks based on the action semantics between agents. ASN can be easily combined with existing deep reinforcement learning (DRL) algorithms to boost their performance. Experimental results on StarCraft II and Neural MMO show ASN significantly improves the performance of state-of-the-art DRL approaches compared with a number of network architectures.

action semantic, agent, artificial intelligence, (18 more...)

arXiv.org Artificial Intelligence

1907.11461

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games > Computer Games (0.90)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.48)

Add feedback

Arena: a toolkit for Multi-Agent Reinforcement Learning

Wang, Qing, Xiong, Jiechao, Han, Lei, Fang, Meng, Sun, Xinghai, Zheng, Zhuobin, Sun, Peng, Zhang, Zhengyou

arXiv.org Artificial IntelligenceJul-20-2019

We introduce Arena, a toolkit for multi-agent reinforcement learning (MARL) research. In MARL, it usually requires customizing observations, rewards and actions for each agent, changing cooperative-competitive agent-interaction, and playing with/against a third-party agent, etc. We provide a novel modular design, called Interface, for manipulating such routines in essentially two ways: 1) Different interfaces can be concatenated and combined, which extends the OpenAI Gym Wrappers concept to MARL scenarios. 2) During MARL training or testing, interfaces can be embedded in either wrapped OpenAI Gym compatible Environments or raw environment compatible Agents. We offer off-the-shelf interfaces for several popular MARL platforms, including StarCraft II, Pommerman, ViZDoom, Soccer, etc. The interfaces effectively support self-play RL and cooperative-competitive hybrid MARL. Also, Arena can be conveniently extended to your own favorite MARL platform.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

1907.09467

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

Add feedback

Interpretable Modelling of Driving Behaviors in Interactive Driving Scenarios based on Cumulative Prospect Theory

Sun, Liting, Zhan, Wei, Hu, Yeping, Tomizuka, Masayoshi

arXiv.org Artificial IntelligenceJul-19-2019

Understanding human driving behavior is important for autonomous vehicles. In this paper, we propose an interpretable human behavior model in interactive driving scenarios based on the cumulative prospect theory (CPT). As a non-expected utility theory, CPT can well explain some systematically biased or ``irrational'' behavior/decisions of human that cannot be explained by the expected utility theory. Hence, the goal of this work is to formulate the human drivers' behavior generation model with CPT so that some ``irrational'' behavior or decisions of human can be better captured and predicted. Towards such a goal, we first develop a CPT-driven decision-making model focusing on driving scenarios with two interacting agents. A hierarchical learning algorithm is proposed afterward to learn the utility function, the value function, and the decision weighting function in the CPT model. A case study for roundabout merging is also provided as verification. With real driving data, the prediction performances of three different models are compared: a predefined model based on time-to-collision (TTC), a learning-based model based on neural networks, and the proposed CPT-based model. The results show that the proposed model outperforms the TTC model and achieves similar performance as the learning-based model with much less training data and better interpretability.

target vehicle, trajectory, vehicle, (16 more...)

arXiv.org Artificial Intelligence

1907.08707

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report > New Finding (0.48)

Industry:

Transportation > Ground > Road (0.49)
Transportation > Infrastructure & Services (0.36)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.34)

Add feedback

Prioritized Guidance for Efficient Multi-Agent Reinforcement Learning Exploration

Wang, Qisheng, Wang, Qichao

arXiv.org Machine LearningJul-19-2019

Exploration efficiency is a challenging problem in multi-agent reinforcement learning (MARL), as the policy learned by confederate MARL depends on the collaborative approach among multiple agents. Another important problem is the less informative reward restricts the learning speed of MARL compared with the informative label in supervised learning. In this work, we leverage on a novel communication method to guide MARL to accelerate exploration and propose a predictive network to forecast the reward of current state-action pair and use the guidance learned by the predictive network to modify the reward function. An improved prioritized experience replay is employed to better take advantage of the different knowledge learned by different agents which utilizes Time-difference (TD) error more effectively. Experimental results demonstrates that the proposed algorithm outperforms existing methods in cooperative multi-agent environments. We remark that this algorithm can be extended to supervised learning to speed up its training.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Machine Learning

1907.07847

Genre: Research Report > New Finding (0.34)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.66)

Add feedback

Measuring the Vulnerability of a Multi-Agent Pathfinding Solution

Yoeli, Rotem (Ben-Gurion University of the Negev) | Stern, Roni (Ben-Gurion University of the Negev) | Atzmon, Dor (Ben-Gurion University of the Negev)

AAAI ConferencesJul-11-2019

Multi-agent pathfinding is the problem of finding a non-interfering paths for a set of agents, such that if the agents follow these paths then each agent will reach its desired destination. Recent years have shown tremendous advances in this field, with optimal and suboptimal algorithms that are able to plan paths for over 100 agents in reasonable time. However, autonomous mobile agents are prime targets for cyber-security attacks, where an adversary may take control over an agent to disrupt the agents execution of their plan. This threat raises two questions. The first question is how much damage can an agent do if it does not follow its plan. The second question is how can one plan a-priori to be as robust as possible to such cyber-attacks. In this work, We provide an answer to both questions. To compute the maximal amount of damage that an adversary agent can do, we define a corresponding graph search problem and solve this problem with A*. Then, we provide a very simple method to choose a solution that is robust to such damages. We demonstrate both algorithms in simulation over standard multi-agent pathfinding domains.

abnormal action, agent, malicious entity, (13 more...)

AAAI Conferences

Twelfth Annual Symposium on Combinatorial Search

Country: Asia > Middle East > Israel > Southern District > Beer-Sheva (0.05)

Industry: Information Technology > Security & Privacy (0.88)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.36)

Add feedback

Generalized Target Assignment and Path Finding Using Answer Set Programming

Nguyen, Van (New Mexico State University) | Obermeier, Philipp (University of Postdam) | Son, Tran Cao (New Mexico State University) | Schaub, Torsten (Washington University in St. Louis) | Yeoh, William (Washington University in St. Louis)

AAAI ConferencesJul-11-2019

In Multi-Agent Path Finding (MAPF), a team of agents needs to find collision-free paths from their starting locations to their respective targets. Combined Target Assignment and Path Finding (TAPF) extends MAPF by including the problem of assigning targets to agents as a precursor to the MAPF problem. A limitation of both models is their assumption that the number of agents and targets are equal, which is invalid in some applications. We address this limitation by generalizing TAPF to allow for (1) unequal number of agents and tasks; (2) tasks to have deadlines by which they must be completed; (3) ordering of groups of tasks to be completed; and (4) tasks that are composed of a sequence of checkpoints that must be visited in a specific order. Further, we model the problem using answer set programming (ASP) to show that customizing the desired variant of the problem is simple -- one only needs to choose the appropriate combination of ASP rules to enforce it. We also demonstrate experimentally that if problem specific information can be incorporated into the ASP encoding then ASP based methods can be efficient and can scale up to solve practical applications.

agent, checkpoint, deadline, (12 more...)

AAAI Conferences

Twelfth Annual Symposium on Combinatorial Search

Country:

North America > United States > New Mexico (0.05)
Europe > Germany > Brandenburg > Potsdam (0.05)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.56)

Add feedback

An Improved Algorithm for Optimal Coalition Structure Generation

Changder, Narayan (National Institute of Technology) | Aknine, Samir (University Lyon 1) | Dutta, Animesh (National Institute of Technology)

AAAI ConferencesJul-11-2019

The Coalition Structure Generation (CSG) problem is a partitioning of a set of agents into exhaustive and disjoint coalitions to maximize social welfare. This NP-complete problem arises in many practical scenarios. Prominent examples are included in the field of transportation, e-Commerce, distributed sensor networks, and others. The fastest exact algorithm to solve the CSG problem is ODP-IP, which is a hybrid version of two previously established algorithms, namely Improved Dynamic Programming (IDP) and IP. In this paper, we show that the ODP-IP algorithm performs many redundant operations. To improve ODP-IP, we propose a faster abortion mechanism to speed up IP’s search. Our abortion mechanism decides at runtime which of the IP's operations are redundant to skip them. Then, we propose a modified version of IDP (named MIDP) and an improved version of IP (named IIP). Based on these two improved algorithms, we develop a hybrid version (MIDP-IIP) to solve the CSG problem. After a detailed description of the new algorithm MIDP-IIP, an experimental comparison is conducted against ODP-IP. Our analysis shows that MIDP-IIP performs fewer operations than ODP-IP. In addition, MIDP-IIP reduced significantly many problem instances running times (11% to 37 %), and improved drastically some of them.

algorithm, coalition, coalition structure, (11 more...)

AAAI Conferences

Twelfth Annual Symposium on Combinatorial Search

Country:

Asia > India (0.05)
Europe > France (0.05)

Industry: Information Technology > Services (0.34)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)

Add feedback