Motivated by efficient algorithms for solving combina- torial and discrete instances of the multi-agent pathfinding problem, this report investigates ways to utilize such solutions to solve similar problems in the continuous domain. While a simple discretization of the space which allows the direct application of combinatorial algorithms seems like a straightforward solution, there are additional constraints that such a discretization needs to satisfy in order to be able to provide some form of completeness guarantees in general configuration spaces. This report reviews ideas on how to utilize combinatorial algorithms to solve continuous multi-agent pathfinding problems. It aims to collect feedback from the community regarding the importance and the complexity of this challenge, as well as the appropriateness of the solutions considered here.
Although many reinforcement learning methods have been proposed for learning the optimal solutions in single-agent continuousaction domains, multiagent coordination domains with continuous actions have received relatively few investigations. In this paper, we propose an independent learner hierarchical method, named Sample Continuous Coordination with recursive Frequency Maximum Q-Value (SCC-rFMQ), which divides the cooperative problem with continuous actions into two layers. The first layer samples a finite set of actions from the continuous action spaces by a re-sampling mechanism with variable exploratory rates, and the second layer evaluates the actions in the sampled action set and updates the policy using a reinforcement learning cooperative method. By constructing cooperative mechanisms at both levels, SCC-rFMQ can handle cooperative problems in continuous action cooperative Markov games effectively. The effectiveness of SCC-rFMQ is experimentally demonstrated on two well-designed games, i.e., a continuous version of the climbing game and a cooperative version of the boat problem. Experimental results show that SCC-rFMQ outperforms other reinforcement learning algorithms. A large number of multiagent coordination domains involve continuous action spaces, such as robot soccer  and multiplayer online battle arena game . In such environments, agents not only need to coordinate with other agents towards desirable outcomes efficiently but also have to deal with infinitely large action spaces.
This article reports on the first international Competition of Distributed and Multiagent Planners (CoDMAP). The competition focused on cooperative domain-independent planners compatible with a minimal multiagent extension of the classical planning model. The motivations for the competition were manifold: to standardize the problem description language with a common set of benchmarks, to promote development of multiagent planners both inside and outside of the multiagent research community, and to serve as a prototype for future multiagent planning competitions. The article provides an overview of cooperative multiagent planning, describes a novel variant of standardized input language for encoding mutliagent planning problems and summarizes the key points of organization, competing planners and results of the competition.
Treatable infectious diseases are a critical challenge for public health. Outreach campaigns can encourage undiagnosed patients to seek treatment but must be carefully targeted to make the most efficient use of limited resources. We present an algorithm to optimally allocate limited outreach resources among demographic groups in the population. The algorithm uses a novel multiagent model of disease spread which both captures the underlying population dynamics and is amenable to optimization. Our algorithm extends, with provable guarantees, to a stochastic setting where we have only a distribution over parameters such as the contact pattern between agents. We evaluate our algorithm on two instances where this distribution is inferred from real world data: tuberculosis in India and gonorrhea in the United States. Our algorithm produces a policy which is predicted to avert an average of least 8,000 person-years of tuberculosis and 20,000 person-years of gonorrhea annually compared to current policy.
Deep Reinforcement Learning (DRL) has been applied to address a variety of cooperative multi-agent problems with either discrete action spaces or continuous action spaces. However, to the best of our knowledge, no previous work has ever succeeded in applying DRL to multi-agent problems with discrete-continuous hybrid (or parameterized) action spaces which is very common in practice. Our work fills this gap by proposing two novel algorithms: Deep Multi-Agent Parameterized Q-Networks (Deep MAPQN) and Deep Multi-Agent Hierarchical Hybrid Q-Networks (Deep MAHHQN). We follow the centralized training but decentralized execution paradigm: different levels of communication between different agents are used to facilitate the training process, while each agent executes its policy independently based on local observations during execution. Our empirical results on several challenging tasks (simulated RoboCup Soccer and game Ghost Story) show that both Deep MAPQN and Deep MAHHQN are effective and significantly outperform existing independent deep parameterized Q-learning method.