AITopics | Agents

Agents

Network-wide traffic signal control optimization using a multi-agent deep reinforcement learning

Li, Zhenning, Yu, Hao, Zhang, Guohui, Dong, Shangjia, Xu, Cheng-Zhong

arXiv.org Artificial Intelligence, Apr-20-2021

Inefficient traffic control may cause numerous problems such as traffic congestion and energy waste. This paper proposes a novel multi-agent reinforcement learning method, named KS-DDPG (Knowledge Sharing Deep Deterministic Policy Gradient), to achieve optimal control by enhancing the cooperation between traffic signals. By introducing a knowledge-sharing-enabled communication protocol, each agent can access the collective representation of the traffic environment collected by all agents. The proposed method is evaluated through two experiments, using synthetic and real-world datasets respectively. The comparison with state-of-the-art reinforcement learning-based and conventional transportation methods demonstrates that the proposed KS-DDPG is highly efficient in controlling large-scale transportation networks and coping with fluctuations in traffic flow. In addition, the introduced communication mechanism has also been shown to speed up the convergence of the model without significantly increasing the computational burden.

agent, intersection, knowledge, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.trc.2021.103059

2104.09936

Country:

Asia > Macao (0.14)
North America > United States > Maryland > Montgomery County (0.04)
North America > United States > Hawaii (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > New Finding (0.93)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

Learning to Communicate with Strangers via Channel Randomisation Methods

Cope, Dylan, Schoots, Nandi

arXiv.org Artificial Intelligence, Apr-19-2021

We introduce two methods for improving the performance of agents meeting for the first time to accomplish a communicative task. The methods are: (1) 'message mutation' during the generation of the communication protocol; and (2) random permutations of the communication channel. These proposals are tested using a simple two-player game involving a 'teacher' who generates a communication protocol and sends a message, and a 'student' who interprets the message. After training multiple agents via self-play, we analyse the performance of these agents when they are matched with a stranger, i.e. their zero-shot communication performance. We find that both message mutation and channel permutation positively influence performance, and we discuss their effects.
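The second method, random permutation of the communication channel, can be pictured with a small sketch. This is an illustration only, not the authors' implementation: it assumes a message is a fixed-length vector and that a permutation reorders its positions, which the abstract does not spell out. The names `random_channel_permutation`, `permute_message`, and `invert_permutation` are hypothetical.

```python
import random


def random_channel_permutation(length, rng):
    """Return a random ordering of the channel positions 0..length-1."""
    perm = list(range(length))
    rng.shuffle(perm)
    return perm


def permute_message(message, perm):
    """Reorder a message so that output position i carries message[perm[i]]."""
    return [message[p] for p in perm]


def invert_permutation(perm):
    """Return the permutation that undoes `perm`."""
    inverse = [0] * len(perm)
    for i, p in enumerate(perm):
        inverse[p] = i
    return inverse


# Assumed toy message: four channel activations sent by the teacher.
rng = random.Random(0)
message = [0.1, 0.7, 0.2, 0.9]
perm = random_channel_permutation(len(message), rng)
scrambled = permute_message(message, perm)

# Applying the inverse permutation recovers the original message.
restored = permute_message(scrambled, invert_permutation(perm))
```

Drawing a fresh permutation during training would, under this assumption, discourage a student from relying on any fixed position in the channel.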

agent, communication, protocol, (13 more...)

arXiv.org Artificial Intelligence

2104.09557

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(3 more...)

Genre: Research Report (0.82)

Industry:

Education (0.46)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Training Value-Aligned Reinforcement Learning Agents Using a Normative Prior

Nahian, Md Sultan Al, Frazier, Spencer, Harrison, Brent, Riedl, Mark

arXiv.org Artificial Intelligence, Apr-19-2021

As more machine learning agents interact with humans, it is increasingly a prospect that an agent trained to perform a task optimally, using only a measure of task performance as feedback, can violate societal norms for acceptable behavior or cause harm. Value alignment is a property of intelligent agents wherein they solely pursue non-harmful behaviors or human-beneficial goals. We introduce an approach to value-aligned reinforcement learning, in which we train an agent with two reward signals: a standard task performance reward, plus a normative behavior reward. The normative behavior reward is derived from a value-aligned prior model previously shown to classify text as normative or non-normative. We show how variations on a policy shaping technique can balance these two sources of reward and produce policies that are both effective and perceived as being more normative. We test our value-alignment technique on three interactive text-based worlds; each world is designed specifically to challenge agents with a task as well as provide opportunities to deviate from the task to engage in normative and/or altruistic behavior.
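The abstract does not state how the two sources are balanced. As a rough sketch, the common multiplicative form of policy shaping multiplies the task policy's action probabilities by a second per-action probability and renormalizes. The function `shape_policy` and the example numbers below are hypothetical, and the paper's variations may differ.

```python
def shape_policy(task_probs, normative_probs):
    """Combine two per-action probability lists by elementwise product,
    then renormalize so the result sums to 1."""
    if len(task_probs) != len(normative_probs):
        raise ValueError("both inputs must cover the same actions")
    products = [t * n for t, n in zip(task_probs, normative_probs)]
    total = sum(products)
    if total == 0.0:
        # Assumed fallback: if the sources fully disagree, keep the task policy.
        return list(task_probs)
    return [p / total for p in products]


# Assumed example: action 0 is best for the task but rated unlikely to be normative.
task_probs = [0.6, 0.3, 0.1]
normative_probs = [0.1, 0.9, 0.9]
shaped = shape_policy(task_probs, normative_probs)
```

In this example the shaped policy shifts most of its probability mass from action 0 to action 1, which scores well on both signals.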

agent, environmental reward, reinforcement, (17 more...)

arXiv.org Artificial Intelligence

2104.09469

Country: North America > United States > Kentucky (0.04)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Agent-Centric Representations for Multi-Agent Reinforcement Learning

Shang, Wenling, Espeholt, Lasse, Raichuk, Anton, Salimans, Tim

arXiv.org Artificial Intelligence, Apr-19-2021

Object-centric representations have recently enabled significant progress in tackling relational reasoning tasks. By building a strong object-centric inductive bias into neural architectures, recent efforts have improved generalization and data efficiency of machine learning algorithms for these problems. One problem class involving relational reasoning that still remains under-explored is multi-agent reinforcement learning (MARL). Here we investigate whether object-centric representations are also beneficial in the fully cooperative MARL setting. Specifically, we study two ways of incorporating an agent-centric inductive bias into our RL algorithm: (1) introducing an agent-centric attention module with explicit connections across agents; and (2) adding an agent-centric unsupervised predictive objective (i.e. not using action labels), to be used as an auxiliary loss for MARL or as the basis of a pre-training step. We evaluate these approaches on the Google Research Football environment as well as DeepMind Lab 2D. Empirically, agent-centric representation learning leads to the emergence of more complex cooperation strategies between agents as well as enhanced sample efficiency and generalization.

agent, arxiv preprint arxiv, auxiliary loss, (13 more...)

arXiv.org Artificial Intelligence

2104.09402

Country: Europe > Switzerland > Zürich > Zürich (0.04)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Sports (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Constraints Satisfiability Driven Reinforcement Learning for Autonomous Cyber Defense

Dutta, Ashutosh, Al-Shaer, Ehab, Chatterjee, Samrat

arXiv.org Artificial Intelligence, Apr-18-2021

With increasing system complexity and attack sophistication, the need for autonomous cyber defense becomes evident for cyber and cyber-physical systems (CPSs). Many existing state-of-the-art frameworks either rely on static models with unrealistic assumptions or fail to satisfy the system safety and security requirements. In this paper, we present a new hybrid autonomous agent architecture that aims to optimize and verify defense policies of reinforcement learning (RL) by incorporating constraints verification (using satisfiability modulo theory (SMT)) into the agent's decision loop. The incorporation of SMT not only ensures the satisfiability of safety and security requirements, but also provides constant feedback to steer the RL decision-making toward safe and effective actions. This approach is critically needed for CPSs that exhibit high risk due to safety or security violations. Our evaluation of the presented approach in a simulated CPS environment shows that the agent learns the optimal policy fast and defeats diversified attack strategies in 99% of cases.

agent, constraint, requirement, (11 more...)

arXiv.org Artificial Intelligence

2104.08994

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > North Carolina > Mecklenburg County > Charlotte (0.14)
North America > United States > Washington > Benton County > Richland (0.04)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.88)

Revisiting the Complexity Analysis of Conflict-Based Search: New Computational Techniques and Improved Bounds

Gordon, Ofir, Filmus, Yuval, Salzman, Oren

arXiv.org Artificial Intelligence, Apr-18-2021

The problem of Multi-Agent Path Finding (MAPF) calls for finding a set of conflict-free paths for a fleet of agents operating in a given environment. Arguably, the state-of-the-art approach to computing optimal solutions is Conflict-Based Search (CBS). In this work we revisit the complexity analysis of CBS to provide tighter bounds on the algorithm's worst-case run-time. Our analysis paves the way to better pinpoint the parameters that govern (in the worst case) the algorithm's computational complexity. Our analysis is based on two complementary approaches. In the first approach we bound the run-time using the size of a Multi-valued Decision Diagram (MDD), a layered graph which compactly contains all possible single-agent paths between two given vertices for a specific path length. In the second approach we express the running time by a novel recurrence relation which bounds the algorithm's complexity. We use generating-function-based analysis in order to tightly bound the recurrence. Using these techniques we provide several new upper bounds on CBS's complexity. The results allow us to improve the existing bound on the running time of CBS for many cases. For example, on a set of common benchmarks we improve the upper bound by a factor of at least $2^{10^{7}}$.
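To make the MDD definition concrete, the sketch below builds the layers of an MDD for one agent. It is a minimal illustration under stated assumptions, not code from the paper: the graph is undirected and given as an adjacency dictionary, an agent may wait in place at each step, and the names `build_mdd` and `neighbors` are hypothetical.

```python
def build_mdd(neighbors, start, goal, length, allow_wait=True):
    """Return the MDD layers as a list of vertex sets.

    Layer t holds every vertex that some path of exactly `length` steps
    from `start` to `goal` can occupy at time t.
    """
    def step(frontier):
        # All vertices reachable from the frontier in one move (or one wait).
        reached = set()
        for v in frontier:
            reached.update(neighbors[v])
            if allow_wait:
                reached.add(v)
        return reached

    # forward[t]: vertices reachable from start in exactly t steps.
    forward = [{start}]
    for _ in range(length):
        forward.append(step(forward[-1]))

    # backward[k]: vertices from which goal is reachable in exactly k steps
    # (valid because the graph is assumed undirected).
    backward = [{goal}]
    for _ in range(length):
        backward.append(step(backward[-1]))

    return [forward[t] & backward[length - t] for t in range(length + 1)]


# Assumed toy graph: a 2x2 grid with corners a, b, c, d.
grid = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"], "d": ["b", "c"]}
layers_len2 = build_mdd(grid, "a", "d", 2)
layers_len3 = build_mdd(grid, "a", "d", 3)
```

For path length 2 the layers are `{a}`, `{b, c}`, `{d}`. Raising the length to 3 widens the middle layers because waiting becomes possible, which illustrates why MDD size depends on the path length.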

cbs, complexity, constraint, (14 more...)

arXiv.org Artificial Intelligence

2104.08759

Country: Asia > Middle East > Israel > Haifa District > Haifa (0.04)

Genre:

Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)

Industry: Leisure & Entertainment (0.34)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Non-monotonic Value Function Factorization for Deep Multi-Agent Reinforcement Learning

Chen, Quanlin

arXiv.org Artificial Intelligence, Apr-18-2021

In this paper, we propose actor-critic approaches that introduce an actor policy on top of QMIX [9], which removes the monotonicity constraint of QMIX and implements a non-monotonic value function factorization for the joint action-value. We evaluate our actor-critic methods on StarCraft II micromanagement tasks and show that they achieve stronger performance on maps with heterogeneous agent types.

agent, deep multi-agent reinforcement learning, monotonicity constraint, (10 more...)

arXiv.org Artificial Intelligence

2104.01939

Genre: Research Report (0.50)

Industry: Leisure & Entertainment (0.37)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Amanda Prorok's talk – Learning to Communicate in Multi-Agent Systems (with video)

Robohub, Apr-17-2021, 09:30:37 GMT

In this technical talk, Amanda Prorok, Assistant Professor in the Department of Computer Science and Technology at Cambridge University, and a Fellow of Pembroke College, discusses her team's latest research on what, how and when information needs to be shared among agents that aim to solve cooperative tasks. Effective communication is key to successful multi-agent coordination. Yet it is far from obvious what, how and when information needs to be shared among agents that aim to solve cooperative tasks. In this talk, I discuss our recent work on using Graph Neural Networks (GNNs) to solve multi-agent coordination problems. In my first case-study, I show how we use GNNs to find a decentralized solution to the multi-agent path finding problem, which is known to be NP-hard.

amanda prorok, computer science, multi-agent system, (13 more...)

Robohub

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.29)
North America > United States > Pennsylvania (0.06)
Europe > Switzerland (0.06)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

A Robust Model for Trust Evaluation during Interactions between Agents in a Sociable Environment

Liang, Qin, Zhang, Minjie, Ren, Fenghui, Ito, Takayuki

arXiv.org Artificial Intelligence, Apr-17-2021

Trust evaluation is an important topic in both research and applications in sociable environments. This paper presents a model for trust evaluation between agents that combines direct trust, indirect trust through neighbouring links, and the reputation of an agent in the environment (i.e. social network) to provide a robust evaluation. Our approach is independent of the social network's topology and operates in a decentralized manner without a central controller, so it can be used in broad domains.

evaluation, interaction, trustee, (15 more...)

arXiv.org Artificial Intelligence

2104.08555

Country:

Oceania > Australia (0.04)
Asia > Japan (0.04)

Genre: Research Report (0.40)

Industry: Information Technology > Services (0.56)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Communications > Social Media (0.72)

Planning with Expectation Models for Control

Kudashkina, Katya, Wan, Yi, Naik, Abhishek, Sutton, Richard S.

arXiv.org Artificial Intelligence, Apr-17-2021

In model-based reinforcement learning (MBRL), Wan et al. (2019) showed conditions under which the environment model could produce the expectation of the next feature vector rather than the full distribution, or a sample thereof, with no loss in planning performance. Such expectation models are of interest when the environment is stochastic and non-stationary, and the model is approximate, such as when it is learned using function approximation. In these cases a full distribution model may be impractical and a sample model may be either more expensive computationally or of high variance. Wan et al. considered only planning for prediction to evaluate a fixed policy. In this paper, we treat the control case: planning to improve and find a good approximate policy. We prove that planning with an expectation model must update a state-value function, not an action-value function as previously suggested (e.g., Sorg & Singh, 2010). This opens the question of how planning influences action selections. We consider three strategies for this and present general MBRL algorithms for each. We identify the strengths and weaknesses of these algorithms in computational experiments. Our algorithms and experiments are the first to treat MBRL with expectation models in a general setting.
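A small sketch may help show what a planning update of a state-value function with an expectation model looks like. It is not one of the paper's algorithms: it assumes a linear value function v(x) = w · x, for which the value of the expected next feature vector equals the expected next value, and the names `planning_update`, `alpha`, and `gamma` are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def planning_update(w, x, expected_reward, expected_next_x, alpha=0.1, gamma=0.9):
    """One TD-style planning step on linear state-value weights `w`.

    The model supplies only the expected reward and the expected next
    feature vector, not a sampled next state or a full distribution.
    """
    td_error = expected_reward + gamma * dot(w, expected_next_x) - dot(w, x)
    return [wi + alpha * td_error * xi for wi, xi in zip(w, x)]


# Assumed example: two features; the model predicts an even mix of both next.
w = [0.0, 0.0]
x = [1.0, 0.0]
w_new = planning_update(w, x, expected_reward=1.0, expected_next_x=[0.5, 0.5])

# Linearity check: averaging values of sampled next features matches the
# value of the averaged (expected) feature vector.
samples = [[1.0, 0.0], [0.0, 1.0]]
weights = [2.0, 4.0]
mean_of_values = sum(dot(weights, s) for s in samples) / len(samples)
value_of_mean = dot(weights, [0.5, 0.5])
```

The linearity check is the reason a single expected feature vector suffices in this assumed setting; with a nonlinear value function the two quantities would generally differ.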

agent, expectation model, proceedings, (11 more...)

arXiv.org Artificial Intelligence

2104.08543

Country:

North America > Canada > Ontario (0.04)
North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
Asia > Middle East > Jordan (0.04)
(10 more...)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
