 Agents


From Few to More: Large-scale Dynamic Multiagent Curriculum Learning

arXiv.org Artificial Intelligence

Considerable effort has been devoted to investigating how agents can learn effectively and achieve coordination in multiagent systems. However, this remains challenging in large-scale multiagent settings because of the complex dynamics between the environment and the agents and the explosion of the state-action space. In this paper, we design a novel Dynamic Multiagent Curriculum Learning (DyMA-CL) approach that solves large-scale problems by first learning on a small multiagent scenario and then progressively increasing the number of agents. We propose three transfer mechanisms across curricula to accelerate the learning process. Moreover, because the state dimension varies across curricula, existing network structures cannot be applied in such a transfer setting, since their network input sizes are fixed. Therefore, we design a novel network structure called Dynamic Agent-number Network (DyAN) to handle the dynamic size of the network input. Experimental results show that DyMA-CL using DyAN greatly improves the performance of large-scale multiagent learning compared with state-of-the-art deep reinforcement learning approaches. We also investigate the influence of the three transfer mechanisms across curricula through extensive simulations.
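To illustrate why a fixed-size input layer is a problem when the number of agents changes, and one generic way around it, here is a minimal sketch. It is not the DyAN architecture, whose details the abstract does not give. It only shows the common idea of embedding each agent's observation with shared weights and pooling the embeddings, so the result has the same length for any agent count. The names `embed` and `pooled_state` and the toy weights are assumptions made for this example.

```python
from typing import List, Sequence


def embed(obs: Sequence[float], weights: Sequence[Sequence[float]]) -> List[float]:
    """Shared linear layer followed by ReLU, applied to one agent's observation."""
    return [max(0.0, sum(w * x for w, x in zip(row, obs))) for row in weights]


def pooled_state(agent_obs: Sequence[Sequence[float]],
                 weights: Sequence[Sequence[float]]) -> List[float]:
    """Mean-pool per-agent embeddings into a vector whose size ignores agent count."""
    embeddings = [embed(obs, weights) for obs in agent_obs]
    count = len(embeddings)
    return [sum(column) / count for column in zip(*embeddings)]


# Hypothetical shared weights: 2 observation features -> 3 embedding features.
WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

small = pooled_state([[1.0, 2.0], [3.0, 4.0]], WEIGHTS)   # 2 agents
large = pooled_state([[1.0, 2.0]] * 5, WEIGHTS)           # 5 agents
print(len(small), len(large))  # both vectors have length 3
```

Because the pooled vector has a fixed length, the layers that consume it can be reused unchanged when a later curriculum adds agents. This is the property any transfer across curricula of different sizes needs.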


AI lends a hand to help large retailers win back their customers

#artificialintelligence

It's late Saturday morning and Mrs. Little enters her usual supermarket, eyes fixed on her watch. In front of her, the aisles are overrun with shopping carts overflowing with all different types of products. She plunges into the crowd, weaving her way between the shoppers and dodging the promotional displays which block the middle of the aisles. Somehow, she manages to pick up two packs of water before fighting her way back to the other end of the store to get some dog food. As her cart becomes heavier, it becomes more difficult to maneuver.


Efficient Communication in Multi-Agent Reinforcement Learning via Variance Based Control

arXiv.org Machine Learning

Multi-agent reinforcement learning (MARL) has recently received considerable attention due to its applicability to a wide range of real-world applications. However, achieving efficient communication among agents has always been an overarching problem in MARL. In this work, we propose Variance Based Control (VBC), a simple yet efficient technique to improve communication efficiency in MARL. By limiting the variance of the messages exchanged between agents during the training phase, the noisy component in the messages can be eliminated effectively, while the useful part can be preserved and utilized by the agents for better performance. Our evaluation using a challenging set of StarCraft II benchmarks indicates that our method achieves $2$ to $10\times$ lower communication overhead than state-of-the-art MARL algorithms, while allowing agents to better collaborate by developing sophisticated strategies.
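One plausible reading of "limiting the variance of the exchanged messages" during training is an extra penalty term added to the task loss. The sketch below shows that idea in plain Python. It rests on assumptions: the abstract does not give the exact VBC objective, and the function names, the per-message population variance, and the `weight` coefficient are all hypothetical.

```python
from statistics import pvariance
from typing import Sequence


def message_variance_penalty(messages: Sequence[Sequence[float]],
                             weight: float = 0.1) -> float:
    """Weighted sum of the variance of each agent's outgoing message vector."""
    return weight * sum(pvariance(message) for message in messages)


def total_loss(task_loss: float,
               messages: Sequence[Sequence[float]],
               weight: float = 0.1) -> float:
    """Task loss plus the variance penalty (assumed form, not the paper's)."""
    return task_loss + message_variance_penalty(messages, weight)


# Two agents: a constant message (variance 0) and a spread-out one (variance 1).
print(total_loss(1.0, [[1.0, 1.0, 1.0], [0.0, 2.0]], weight=0.5))
```

Under such a penalty, a message only keeps high variance when doing so lowers the task loss enough to pay for it. This matches the intuition in the abstract that noisy components are suppressed while useful ones survive.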


Synchronous Rendezvous for Networks of Marine Robots in Large Scale Ocean Monitoring

#artificialintelligence

In this work, we are interested in the synchronous rendezvous of a team of agents deployed on a connected network of orbits. The agents coordinate their motions with neighbors they discovered in the rendezvous zone such that rendezvous occurs periodically and the duration of each rendezvous event is maximized.


Recognizing Top-Monotonic Preference Profiles in Polynomial Time

Journal of Artificial Intelligence Research

We provide the first polynomial-time algorithm for recognizing if a profile of (possibly weak) preference orders is top-monotonic. Top-monotonicity is a generalization of the notions of single-peakedness and single-crossingness, defined by Barbera and Moreno. Top-monotonic profiles always have weak Condorcet winners and satisfy a variant of the median voter theorem. Our algorithm proceeds by reducing the recognition problem to the SAT-2CNF problem.
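The abstract does not describe the reduction itself, so the sketch below only illustrates why a reduction to SAT-2CNF (2-SAT) yields a polynomial-time procedure: 2-SAT can be decided in linear time using the strongly connected components of the implication graph. The function name `solve_2sat` and the signed-integer literal encoding are assumptions made for this example, and Kosaraju's algorithm is one standard choice among several.

```python
from typing import List, Optional, Tuple


def solve_2sat(num_vars: int, clauses: List[Tuple[int, int]]) -> Optional[List[bool]]:
    """Literal +i means x_i, -i means (not x_i), variables are 1-indexed.

    Returns a satisfying assignment, or None if the formula is unsatisfiable.
    """
    n = 2 * num_vars

    def node(lit: int) -> int:
        # Even index: positive literal, odd index: its negation.
        return 2 * (abs(lit) - 1) + (1 if lit < 0 else 0)

    graph: List[List[int]] = [[] for _ in range(n)]
    rgraph: List[List[int]] = [[] for _ in range(n)]
    for a, b in clauses:
        # (a or b) is equivalent to (not a -> b) and (not b -> a).
        for u, v in ((node(-a), node(b)), (node(-b), node(a))):
            graph[u].append(v)
            rgraph[v].append(u)

    # Kosaraju pass 1: iterative DFS recording vertices by finish time.
    visited = [False] * n
    order: List[int] = []
    for start in range(n):
        if visited[start]:
            continue
        visited[start] = True
        stack = [(start, 0)]
        while stack:
            u, i = stack.pop()
            if i < len(graph[u]):
                stack.append((u, i + 1))
                v = graph[u][i]
                if not visited[v]:
                    visited[v] = True
                    stack.append((v, 0))
            else:
                order.append(u)

    # Pass 2: label components on the reversed graph, in topological order.
    comp = [-1] * n
    label = 0
    for start in reversed(order):
        if comp[start] != -1:
            continue
        comp[start] = label
        pending = [start]
        while pending:
            u = pending.pop()
            for v in rgraph[u]:
                if comp[v] == -1:
                    comp[v] = label
                    pending.append(v)
        label += 1

    assignment: List[bool] = []
    for i in range(num_vars):
        if comp[2 * i] == comp[2 * i + 1]:
            return None  # x_i and (not x_i) imply each other: unsatisfiable
        # The literal whose component comes later topologically is set true.
        assignment.append(comp[2 * i] > comp[2 * i + 1])
    return assignment


print(solve_2sat(3, [(1, 2), (-1, 2), (-2, 3)]))
```

Any recognition algorithm of this kind would encode its structural choices as Boolean variables with two-literal constraints and pass them to a solver like this one. The specific encoding for top-monotonicity is the paper's contribution and is not reproduced here.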


No Press Diplomacy: Modeling Multi-Agent Gameplay

arXiv.org Artificial Intelligence

Diplomacy is a seven-player non-stochastic, non-cooperative game, where agents acquire resources through a mix of teamwork and betrayal. Reliance on trust and coordination makes Diplomacy the first non-cooperative multi-agent benchmark for complex sequential social dilemmas in a rich environment. In this work, we focus on training an agent that learns to play the No Press version of Diplomacy, where there is no dedicated communication channel between players. We present DipNet, a neural-network-based policy model for No Press Diplomacy. The model was trained on a new dataset of more than 150,000 human games. Our model is trained by supervised learning (SL) from expert trajectories and is then used to initialize a reinforcement learning (RL) agent trained through self-play. Both the SL and RL agents demonstrate state-of-the-art No Press performance by beating popular rule-based bots.


Fractals2019: Combinatorial Optimisation with Dynamic Constraint Annealing

arXiv.org Artificial Intelligence

Fractals2019 started as a new experimental entry in the RoboCup Soccer 2D Simulation League, based on the Gliders2d code base, and advanced to win the RoboCup-2019 championship. Our approach is centred on combinatorial optimisation methods, within the framework of Guided Self-Organisation (GSO), with the search guided by local constraints. We present examples of several tactical tasks based on the fully released Gliders2d code (version v2), including the search for an optimal assignment of heterogeneous player types, as well as blocking behaviours, offside trap, and attacking formations. We propose a new method, Dynamic Constraint Annealing, for solving dynamic constraint satisfaction problems, and apply it to optimise the thermodynamic potential of collective behaviours, under dynamically induced constraints.

1 Introduction

The RoboCup Soccer 2D Simulation League provides a rich dynamic environment, facilitated by the RoboCup Soccer Simulator (RCSS), intended to test advances in decentralised collective behaviours of autonomous agents. The challenges include concurrent adversarial actions, computational nondeterminism, noise and latency in asynchronous perception and actuation, and limited processing time [1-9]. Over the years the progress of the League has been supported by several important base code releases, covering both low-level skills and standardised world models of simulated agents [10-13]. The release in 2010 of the base code of HELIOS team, agent2d-3.0.0, later upgraded to agent2d-3.1.1,


Robot Capability and Intention in Trust-based Decisions across Tasks

arXiv.org Artificial Intelligence

In this paper, we present results from a human-subject study designed to explore two facets of human mental models of robots (inferred capability and intention) and their relationship to overall trust and eventual decisions. In particular, we examine delegation situations characterized by uncertainty, and explore how inferred capability and intention are applied across different tasks. We develop an online survey where human participants decide whether to delegate control to a simulated UAV agent. Our study shows that human estimations of robot capability and intent correlate strongly with overall self-reported trust. However, overall trust is not independently sufficient to determine whether a human will decide to trust (delegate) a given task to a robot. Instead, our study reveals that estimations of robot intention, capability, and overall trust are integrated when deciding to delegate. From a broader perspective, these results suggest that calibrating overall trust alone is insufficient; to make correct decisions, humans need (and use) multifaceted mental models when collaborating with robots across multiple contexts.

INTRODUCTION

Trust is a cornerstone of long-lasting collaboration in human teams, and is crucial for human-robot cooperation [1]. For example, human trust in robots influences usage [2], and willingness to accept information or suggestions [3]. Misplaced trust in robots can lead to poor task allocation and unsatisfactory outcomes.


Modelling Bushfire Evacuation Behaviours

arXiv.org Artificial Intelligence

Bushfires pose a significant threat to Australia's regional areas. To minimise risk and increase resilience, communities need robust evacuation strategies that account for people's likely behaviour both before and during a bushfire. Agent-based modelling (ABM) offers a practical way to simulate a range of bushfire evacuation scenarios. However, the ABM should reflect the diversity of possible human responses in a given community. The Belief-Desire-Intention (BDI) cognitive model captures behaviour in a compact representation that is understandable by domain experts. Within a BDI-ABM simulation, individual BDI agents can be assigned profiles that determine their likely behaviour. Over a population of agents, their collective behaviour characterises the community response. These profiles are drawn from existing human behaviour research and consultation with emergency services personnel, and capture the expected behaviours of identified groups in the population, both prior to and during an evacuation. A realistic representation of each community can then be formed, and evacuation scenarios within the simulation can be used to explore the possible impact of population structure on outcomes. It is hoped that this will give an improved understanding of the risks associated with evacuation, and lead to tailored evacuation plans for each community to help them prepare for and respond to bushfire.


An Open-Source Framework for Adaptive Traffic Signal Control

arXiv.org Artificial Intelligence

Developing optimal transportation control systems at the appropriate scale can be difficult, as cities' transportation systems can be large, complex and stochastic. Intersection traffic signal controllers are an important element of modern transportation infrastructure, where sub-optimal control policies can incur high costs to many users. Many adaptive traffic signal controllers have been proposed by the community, but research is lacking regarding their relative performance differences: which adaptive traffic signal controller is best remains an open question. This research contributes a framework for developing and evaluating different adaptive traffic signal controller models in simulation, both learning and non-learning, and demonstrates its capabilities. The framework is used, first, to investigate the performance variance of the modelled adaptive traffic signal controllers with respect to their hyperparameters and, second, to analyze the performance differences between controllers with optimal hyperparameters. The proposed framework contains implementations of some of the most popular adaptive traffic signal controllers from the literature: Webster's, Max-pressure and Self-Organizing Traffic Lights, along with deep Q-network and deep deterministic policy gradient reinforcement learning controllers. This framework will aid researchers by accelerating their work from a common starting point, allowing them to generate results faster with less effort.
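To make one of the non-learning controllers named above concrete, here is a minimal sketch of the Max-pressure rule in its textbook form. For each signal phase it sums, over the movements that phase serves, the incoming queue minus the outgoing queue, then picks the phase with the largest total. It is not the framework's implementation, which may add details such as minimum green times. The lane names, the `Movement` type and both function names are assumptions made for this example.

```python
from typing import Dict, List, Tuple

# A movement is a hypothetical (incoming lane, outgoing lane) pair.
Movement = Tuple[str, str]


def phase_pressure(movements: List[Movement], queue: Dict[str, int]) -> int:
    """Pressure of a phase: incoming minus outgoing vehicles over its movements."""
    return sum(queue.get(inc, 0) - queue.get(out, 0) for inc, out in movements)


def max_pressure_phase(phases: Dict[str, List[Movement]], queue: Dict[str, int]) -> str:
    """Return the name of the phase with the highest pressure."""
    return max(phases, key=lambda name: phase_pressure(phases[name], queue))


# Toy four-way intersection with two phases.
PHASES = {
    "NS": [("n_in", "s_out"), ("s_in", "n_out")],
    "EW": [("e_in", "w_out"), ("w_in", "e_out")],
}
QUEUE = {"n_in": 8, "s_in": 5, "e_in": 3, "w_in": 2,
         "s_out": 1, "n_out": 0, "w_out": 0, "e_out": 4}

print(max_pressure_phase(PHASES, QUEUE))  # NS pressure is 12, EW pressure is 1
```

A controller built on this rule would call `max_pressure_phase` at each decision step with fresh queue counts from the simulator. Its only tunable choices are how often it decides and how long a selected phase must stay green.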