Goto

Collaborating Authors

 Agent Societies


Natural Actor-Critic Converges Globally for Hierarchical Linear Quadratic Regulator

arXiv.org Machine Learning

Multi-agent reinforcement learning has been successfully applied to a number of challenging problems. Despite these empirical successes, theoretical understanding of different algorithms is lacking, primarily due to the curse of dimensionality caused by the exponential growth of the state-action space with the number of agents. We study a fundamental problem of multi-agent linear quadratic regulator in a setting where the agents are partially exchangeable. In this setting, we develop a hierarchical actor-critic algorithm, whose computational complexity is independent of the total number of agents, and prove its global linear convergence to the optimal policy. As linear quadratic regulators are often used to approximate general dynamic systems, this paper provided an important step towards better understanding of general hierarchical mean-field multi-agent reinforcement learning.


Resolving Congestions in the Air Traffic Management Domain via Multiagent Reinforcement Learning Methods

arXiv.org Artificial Intelligence

In this article, we report on the efficiency and effectiveness of multiagent reinforcement learning methods (MARL) for the computation of flight delays to resolve congestion problems in the Air Traffic Management (ATM) domain. Specifically, we aim to resolve cases where demand of airspace use exceeds capacity (demand-capacity problems), via imposing ground delays to flights at the pre-tactical stage of operations (i.e. few days to few hours before operation). Casting this into the multiagent domain, agents, representing flights, need to decide on own delays w.r.t. own preferences, having no information about others' payoffs, preferences and constraints, while they plan to execute their trajectories jointly with others, adhering to operational constraints. Specifically, we formalize the problem as a multiagent Markov Decision Process (MA-MDP) and we show that it can be considered as a Markov game in which interacting agents need to reach an equilibrium: What makes the problem more interesting is the dynamic setting in which agents operate, which is also due to the unforeseen, emergent effects of their decisions in the whole system. We propose collaborative multiagent reinforcement learning methods to resolve demand-capacity imbalances: Extensive experimental study on real-world cases, shows the potential of the proposed approaches in resolving problems, while advanced visualizations provide detailed views towards understanding the quality of solutions provided.


Training multi-agent AI systems to solve complex tasks through cooperation

#artificialintelligence

A novel approach to cooperative multi-agent reinforcement learning (RL) that assigns tasks to individual agents within a group, thereby improving the entire group's ability to collaborate. We tested this method in the real-time strategy game StarCraft: Brood War, and found that our RL-trained model significantly outperformed computer-controlled players that relied on carefully tuned rule-based baselines. Perhaps most important, these gains carried over to matches with significantly larger armies than what we included in our training scenarios. We're releasing the source code for this approach on our TorchCraftAI GitHub repository, and detailing our results, which indicate that treating collaborative multi-agent RL as a dynamic assignment problem can lead to groups of agents that are better at generalizing to more complex situations. Our approach focuses on multi-agent collaborative (MAC) problems where agents have to carry out multiple intermediate tasks in order to accomplish a larger one.


An Action Language for Multi-Agent Domains: Foundations

arXiv.org Artificial Intelligence

In multi-agent domains (MADs), an agent's action may not just change the world and the agent's knowledge and beliefs about the world, but also may change other agents' knowledge and beliefs about the world and their knowledge and beliefs about other agents' knowledge and beliefs about the world. The goals of an agent in a multi-agent world may involve manipulating the knowledge and beliefs of other agents' and again, not just their knowledge/belief about the world, but also their knowledge about other agents' knowledge about the world. Our goal is to present an action language (mA+) that has the necessary features to address the above aspects in representing and RAC in MADs. mA+ allows the representation of and reasoning about different types of actions that an agent can perform in a domain where many other agents might be present -- such as world-altering actions, sensing actions, and announcement/communication actions. It also allows the specification of agents' dynamic awareness of action occurrences which has future implications on what agents' know about the world and other agents' knowledge about the world. mA+ considers three different types of awareness: full-, partial- awareness, and complete oblivion of an action occurrence and its effects. This keeps the language simple, yet powerful enough to address a large variety of knowledge manipulation scenarios in MADs. The semantics of mA+ relies on the notion of state, which is described by a pointed Kripke model and is used to encode the agent's knowledge and the real state of the world. It is defined by a transition function that maps pairs of actions and states into sets of states. We illustrate properties of the action theories, including properties that guarantee finiteness of the set of initial states and their practical implementability. Finally, we relate mA+ to other related formalisms that contribute to RAC in MADs.


Decentralized Multi-Agent Reinforcement Learning with Networked Agents: Recent Advances

arXiv.org Artificial Intelligence

Multi-agent reinforcement learning (MARL) has long been a significant and everlasting research topic in both machine learning and control. With the recent development of (single-agent) deep RL, there is a resurgence of interests in developing new MARL algorithms, especially those that are backed by theoretical analysis. In this paper, we review some recent advances a sub-area of this topic: decentralized MARL with networked agents. Specifically, multiple agents perform sequential decision-making in a common environment, without the coordination of any central controller. Instead, the agents are allowed to exchange information with their neighbors over a communication network. Such a setting finds broad applications in the control and operation of robots, unmanned vehicles, mobile sensor networks, and smart grid. This review is built upon several our research endeavors in this direction, together with some progresses made by other researchers along the line. We hope this review to inspire the devotion of more research efforts to this exciting yet challenging area.


Hierarchical Cooperative Multi-Agent Reinforcement Learning with Skill Discovery

arXiv.org Machine Learning

Human players in professional team sports achieve high level coordination by dynamically choosing complementary skills and executing primitive actions to perform these skills. As a step toward creating intelligent agents with this capability for fully cooperative multi-agent settings, we propose a two-level hierarchical multi-agent reinforcement learning (MARL) algorithm with unsupervised skill discovery. Agents learn useful and distinct skills at the low level via independent Q-learning, while they learn to select complementary latent skill variables at the high level via centralized multi-agent training with an extrinsic team reward. The set of low-level skills emerges from an intrinsic reward that solely promotes the decodability of latent skill variables from the trajectory of a low-level skill, without the need for hand-crafted rewards for each skill. For scalable decentralized execution, each agent independently chooses latent skill variables and primitive actions based on local observations. Our overall method enables the use of general cooperative MARL algorithms for training high level policies and single-agent RL for training low level skills. Experiments on a stochastic high dimensional team game show the emergence of useful skills and cooperative team play. The interpretability of the learned skills show the promise of the proposed method for achieving human-AI cooperation in team sports games.


Collective Learning

arXiv.org Machine Learning

Department of Electrical, Electronic and Information Engineering Alma Mater Studiorum - Universit a di Bologna Bologna, Italy Abstract In this paper, we introduce the concept of collective learning (CL) which exploits the notion of collective intelligence in the field of distributed semi-supervised learning. The proposed framework draws inspiration from the learning behavior of human beings, who alternate phases involving collaboration, confrontation and exchange of views with other consisting of studying and learning on their own. On this regard, CL comprises two main phases: a self-training phase in which learning is performed on local private (labeled) data only and a collective training phase in which proxy-labels are assigned to shared (unlabeled) data by means of a consensus-based algorithm. In the considered framework, heterogeneous systems can be connected over the same network, each with different computational capabilities and resources and everyone in the network may take advantage of the cooperation and will eventually reach higher performance with respect to those it can reach on its own. An extensive experimental campaign on an image classification problem emphasizes the properties of CL by analyzing the performance achieved by the cooperating agents. 1 Introduction The notion of collective intelligence has been firstly introduced in [Engelbart, 1962] and widespread in the sociological field by Pierre L evy in [L evy and Bononno, 1997]. By borrowing the words of L evy, collective intelligence " is a form of universally distributed intelligence, constantly enhanced, coordinated in real time, and resulting in the effective mobilization of skills ". Moreover, " the basis and goal of collective intelligence is mutual recognition and enrichment of individuals rather than the cult of fetishized or hypostatized communities ". In this paper, we aim to exploit some concepts borrowed from the notion of collective intelligence in a distributed machine learning scenario. In fact, by cooperating with each other, machines may exhibit performance higher than those they can obtain by learning on their own. We call this framework collective learning (CL) . Distributed systems 1 have received a steadily growing attention in the last years and1 When talking about distributed systems, the word distributed can be used with different meanings.


Scalable Reinforcement Learning of Localized Policies for Multi-Agent Networked Systems

arXiv.org Artificial Intelligence

We study reinforcement learning (RL) in a setting with a network of agents whose states and actions interact in a local manner where the objective is to find localized policies such that the (discounted) global reward is maximized. A fundamental challenge in this setting is that the state-action space size scales exponentially in the number of agents, rendering the problem intractable for large networks. In this paper, we propose a Scalable Actor-Critic (SAC) framework that exploits the network structure and finds a localized policy that is a $O(\rho^\kappa)$-approximation of a stationary point of the objective for some $\rho\in(0,1)$, with complexity that scales with the local state-action space size of the largest $\kappa$-hop neighborhood of the network.


A Variational Perturbative Approach to Planning in Graph-based Markov Decision Processes

arXiv.org Machine Learning

Coordinating multiple interacting agents to achieve a common goal is a difficult task with huge applicability. This problem remains hard to solve, even when limiting interactions to be mediated via a static interaction-graph. We present a novel approximate solution method for multi-agent Markov decision problems on graphs, based on variational perturbation theory. We adopt the strategy of planning via inference, which has been explored in various prior works. We employ a non-trivial extension of a novel high-order variational method that allows for approximate inference in large networks and has been shown to surpass the accuracy of existing variational methods. To compare our method to two state-of-the-art methods for multi-agent planning on graphs, we apply the method different standard GMDP problems. We show that in cases, where the goal is encoded as a non-local cost function, our method performs well, while state-of-the-art methods approach the performance of random guess. In a final experiment, we demonstrate that our method brings significant improvement for synchronization tasks.


Decentralised Sparse Multi-Task Regression

arXiv.org Machine Learning

We consider a sparse multi-task regression framework for fitting a collection of related sparse models. Representing models as nodes in a graph with edges between related models, a framework that fuses lasso regressions with the total variation penalty is investigated. Under a form of restricted eigenvalue assumption, bounds on prediction and squared error are given that depend upon the sparsity of each model and the differences between related models. This assumption relates to the smallest eigenvalue restricted to the intersection of two cone sets of the covariance matrix constructed from each of the agents' covariances. We show that this assumption can be satisfied if the constructed covariance matrix satisfies a restricted isometry property. In the case of a grid topology high-probability bounds are given that match, up to log factors, the no-communication setting of fitting a lasso on each model, divided by the number of agents. A decentralised dual method that exploits a convex-concave formulation of the penalised problem is proposed to fit the models and its effectiveness demonstrated on simulations against the group lasso and variants.