AITopics

In Multi-Agent Reinforcement Learning, communication is critical to encourage cooperation among agents. Communication in realistic wireless networks can be highly unreliable due to network conditions varying with agents' mobility, and stochasticity in the transmission process. We propose a framework to learn practical communication strategies by addressing three fundamental questions: (1) When: Agents learn the timing of communication based not only on message importance but also on wireless channel conditions. (2) What: Agents augment message contents with wireless network measurements to better select the game and communication actions. (3) How: Agents use a novel neural message encoder to preserve all information from received messages, regardless of the number and order of messages. Simulating standard benchmarks under realistic wireless network settings, we show significant improvements in game performance, convergence speed and communication efficiency compared with the state of the art.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

2209.01288

Country: North America > United States > California (0.14)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (0.47)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Congested Urban Networks Tend to Be Insensitive to Signal Settings: Implications for Learning-Based Control

Laval, Jorge, Zhou, Hao

This paper highlights several properties of large urban networks that can have an impact on machine learning methods applied to traffic signal control. In particular, we show that the average network flow tends to be independent of the signal control policy as density increases. This property, which so far has remained under the radar, implies that deep reinforcement learning (DRL) methods become ineffective when trained under congested conditions, and might explain DRL's limited success for traffic signal control. Our results apply to all possible grid networks thanks to a parametrization based on two network parameters: the ratio of the expected distance between consecutive traffic lights to the expected green time, and the turning probability at intersections. Networks with different parameters exhibit very different responses to traffic signal control. Notably, we found that no control (i.e., a random policy) can be an effective control strategy for a surprisingly large family of networks. The impact of the turning probability turned out to be very significant both for baseline and for DRL policies. It also explains the loss of symmetry observed for these policies, which is not captured by existing theories that rely on corridor approximations without turns. Our findings also suggest that supervised learning methods have enormous potential as they require very few examples to produce excellent policies.

congestion, probability, signal control, (16 more...)

2008.10989

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Georgia > Fulton County > Atlanta (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Weber, Pascal, Wälchli, Daniel, Zeqiri, Mustafa, Koumoutsakos, Petros

Remember and Forget Experience Replay for Multi-Agent Reinforcement Learning

We present the extension of the Remember and Forget for Experience Replay (ReF-ER) algorithm to Multi-Agent Reinforcement Learning (MARL). ReF-ER was shown to outperform state-of-the-art algorithms for continuous control in problems ranging from the OpenAI Gym to complex fluid flows. In MARL, the dependencies between the agents are included in the state-value estimator and the environment dynamics are modeled via the importance weights used by ReF-ER. In collaborative environments, we find the best performance when the value is estimated using individual rewards and we ignore the effects of other actions on the transition map. We benchmark the performance of ReF-ER MARL on the Stanford Intelligent Systems Laboratory (SISL) environments. We find that employing a single feed-forward neural network for the policy and the value function in ReF-ER MARL outperforms state-of-the-art algorithms that rely on complex neural network architectures.

agent, algorithm, reinforcement learning, (11 more...)

2203.13319

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
(11 more...)

Genre: Research Report (0.83)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Alharthi, Khulud, Abdallah, Zahraa S, Hauert, Sabine

Understandable Controller Extraction from Video Observations of Swarms

Swarm behavior emerges from the local interaction of agents and their environment often encoded as simple rules. Extracting the rules by watching a video of the overall swarm behavior could help us study and control swarm behavior in nature, or artificial swarms that have been designed by external actors. It could also serve as a new source of inspiration for swarm robotics. Yet extracting such rules is challenging as there is often no visible link between the emergent properties of the swarm and their local interactions. To this end, we develop a method to automatically extract understandable swarm controllers from video demonstrations. The method uses evolutionary algorithms driven by a fitness function that compares eight high-level swarm metrics. The method is able to extract many controllers (behavior trees) in a simple collective movement task. We then provide a qualitative analysis of behaviors that resulted in different trees, but similar behaviors. This provides the first steps toward automatic extraction of swarm controllers based on observations.

behavior tree, controller, swarm behavior, (13 more...)

2209.01118

Country:

Europe > United Kingdom > England > Bristol (0.04)
Asia > Middle East > Saudi Arabia (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Lancewicki, Tal, Rosenberg, Aviv, Mansour, Yishay

Cooperative Online Learning in Stochastic and Adversarial MDPs

We study cooperative online learning in stochastic and adversarial Markov decision processes (MDPs). That is, in each episode, $m$ agents interact with an MDP simultaneously and share information in order to minimize their individual regret. We consider environments with two types of randomness: fresh, where each agent's trajectory is sampled i.i.d., and non-fresh, where the realization is shared by all agents (but each agent's trajectory is also affected by its own actions). More precisely, with non-fresh randomness the realization of every cost and transition is fixed at the start of each episode, and agents that take the same action in the same state at the same time observe the same cost and next state. We thoroughly analyze all relevant settings, highlight the challenges and differences between the models, and prove nearly-matching regret lower and upper bounds. To our knowledge, we are the first to consider cooperative reinforcement learning (RL) with either non-fresh randomness or in adversarial MDPs.

agent, cooperative online learning, learning, (13 more...)

2201.13170

Country:

Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Maryland > Baltimore (0.04)
(2 more...)

Genre: Research Report (0.81)

Industry: Education > Educational Setting > Online (0.71)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

Rahn, Simon, Gödel, Marion, Köster, Gerta, Hofinger, Gesine

Modelling airborne transmission of SARS-CoV-2 at a local scale

The coronavirus disease (COVID-19) pandemic has changed our lives and still poses a challenge to science. Numerous studies have contributed to a better understanding of the pandemic. In particular, inhalation of aerosolised pathogens has been identified as essential for transmission. This information is crucial to slow the spread, but the individual likelihood of becoming infected in everyday situations remains uncertain. Mathematical models help estimate such risks. In this study, we propose how to model airborne transmission of SARS-CoV-2 at a local scale. In this regard, we combine microscopic crowd simulation with a new model for disease transmission. Inspired by compartmental models, we describe agents' health status as susceptible, exposed, infectious or recovered. Infectious agents exhale pathogens bound to persistent aerosols, whereas susceptible agents absorb pathogens when moving through an aerosol cloud left by the infectious agent. The transmission depends on the pathogen load of the aerosol cloud, which changes over time. We propose a 'high risk' benchmark scenario to distinguish critical from non-critical situations. Simulations of indoor situations show that the new model is suitable to evaluate the risk of exposure qualitatively and, thus, enables scientists or even decision-makers to better assess the spread of COVID-19 and similar diseases.

agent, pathogen, transmission, (15 more...)

doi: 10.1371/journal.pone.0273820

2111.08547

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
North America > United States > Washington > Skagit County (0.04)
Europe > Austria (0.04)
(3 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)

Technology:

Information Technology > Modeling & Simulation (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.66)

EvolvingBehavior: Towards Co-Creative Evolution of Behavior Trees for Game NPCs

Partlan, Nathan, Soto, Luis, Howe, Jim, Shrivastava, Sarthak, El-Nasr, Magy Seif, Marsella, Stacy

To assist game developers in crafting game NPCs, we present EvolvingBehavior, a novel tool for genetic programming to evolve behavior trees in Unreal Engine 4. In an initial evaluation, we compare evolved behavior to hand-crafted trees designed by our researchers, and to randomly-grown trees, in a 3D survival game. We find that EvolvingBehavior is capable of producing behavior approaching the designer's goals in this context. Finally, we discuss implications and future avenues of exploration for co-creative game AI design tools, as well as challenges and difficulties in behavior tree evolution.

behavior tree, designer, node, (14 more...)

2209.01020

Country:

North America > Canada > Quebec > Montreal (0.14)
North America > United States > California > Santa Cruz County > Santa Cruz (0.14)
Europe > Greece > Attica > Athens (0.05)
(4 more...)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.68)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Information Technology > Software (1.00)
Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

On Almost-Sure Intention Deception Planning that Exploits Imperfect Observers

Fu, Jie

Intention deception involves computing a strategy which deceives the opponent into a wrong belief about the agent's intention or objective. This paper studies a class of probabilistic planning problems with intention deception and investigates how a defender's limited sensing modality can be exploited by an attacker to achieve its attack objective almost surely (with probability one) while hiding its intention. In particular, we study attack planning in a stochastic system modeled as a Markov decision process (MDP). The attacker is to reach some target states while avoiding unsafe states in the system and knows that its behavior is monitored by a defender with partial observations. Given partial state observations for the defender, we develop qualitative intention deception planning algorithms that construct attack strategies to play against an action-visible defender and an action-invisible defender, respectively. The synthesized attack strategy not only ensures the attack objective is satisfied almost surely but also deceives the defender into believing that the observed behavior is generated by a normal/legitimate user and thus failing to detect the presence of an attack. We show the proposed algorithms are correct and complete and illustrate the deceptive planning methods with examples.

attacker, defender, objective, (15 more...)

2209.00573

Country:

North America > United States > Florida > Alachua County > Gainesville (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > Greece (0.04)

Genre: Research Report (0.84)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Leveraging Heterogeneous Capabilities in Multi-Agent Systems for Environmental Conflict Resolution

Cao, Michael Enqi, Warnke, Jonas, Han, Yunhai, Ni, Xinpei, Zhao, Ye, Coogan, Samuel

In this paper, we introduce a high-level controller synthesis framework that enables teams of heterogeneous agents to assist each other in resolving environmental conflicts that appear at runtime. This conflict resolution method is built upon temporal-logic-based reactive synthesis to guarantee safety and task completion under specific environment assumptions. In heterogeneous multi-agent systems, every agent is expected to complete its own tasks in service of a global team objective. However, at runtime, an agent may encounter un-modeled obstacles (e.g., doors or walls) that prevent it from achieving its own task. To address this problem, we employ the capabilities of other heterogeneous agents to resolve the obstacle. A controller framework is proposed to redirect agents with the capability of resolving the appropriate obstacles to the required target when such a situation is detected. Three case studies involving a bipedal robot Digit and a quadcopter are used to evaluate the controller performance in action. Additionally, we implement the proposed framework on a physical multi-agent robotic system to demonstrate its viability for real world applications.

agent, digit, obstacle, (17 more...)

2206.01833

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
Europe > Italy > Lazio > Rome (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Vinanzi, Samuele, Cangelosi, Angelo

CASPER: Cognitive Architecture for Social Perception and Engagement in Robots

Our world is being increasingly pervaded by intelligent robots with varying degrees of autonomy. To seamlessly integrate themselves in our society, these machines should possess the ability to navigate the complexities of our daily routines even in the absence of a human's direct input. In other words, we want these robots to understand the intentions of their partners with the purpose of predicting the best way to help them. In this paper, we present CASPER (Cognitive Architecture for Social Perception and Engagement in Robots): a symbolic cognitive architecture that uses qualitative spatial reasoning to anticipate the pursued goal of another agent and to calculate the best collaborative behavior. This is performed through an ensemble of parallel processes that model a low-level action recognition and a high-level goal understanding, both of which are formally verified. We have tested this architecture in a simulated kitchen environment and the results we have collected show that the robot is able to both recognize an ongoing goal and to properly collaborate towards its achievement. This demonstrates a new use of Qualitative Spatial Relations applied to the problem of intention reading in the domain of human-robot interaction.

architecture, cognitive architecture, robot, (16 more...)

2209.01012

Country:

North America > United States > New York (0.04)
North America > United States > Indiana (0.04)
Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.93)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Cognitive Architectures (1.00)