This paper presents a discussion of coordination properties within populations of geospatially distributed embodied agents. We define two axes, interaction mechanisms and population diversity, and we present a new framework designed for exploring the relationship between values along these axes and the efficiency of solutions for a set of related tasks, e.g., foraging, resource allocation, and area coverage.
Intelligent transportation systems (ITSs) are envisioned to be crucial for smart cities: they aim to improve traffic flow and reduce congestion, raising both the quality of life of urban residents and the efficiency of commuting. However, several challenges need to be resolved before such systems can be deployed. For example, conventional solutions for Markov decision processes (MDPs) and single-agent reinforcement learning (RL) algorithms suffer from poor scalability, and multi-agent systems suffer from poor communication and coordination. In this paper, we explore the potential of mutual information sharing, in other words spatial-influence-based communication, to optimize traffic light control policies. First, we mathematically analyze the transportation system and conclude that it does not have a stationary Nash equilibrium, so reinforcement learning algorithms offer suitable solutions. Second, we describe how to build a multi-agent Deep Deterministic Policy Gradient (DDPG) system that incorporates spatial influence and social group utility. We then use a grid-topology road network to empirically demonstrate the scalability of the new system. We demonstrate three types of directed communication to show the effect of the direction of social influence on the utility of the entire network and of individual agents. Lastly, we define a "selfish index" and analyze its effect on total group utility.
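The abstract does not say how the "selfish index" enters the utility. As a purely illustrative sketch, one could assume it is a weight in [0, 1] that blends an agent's own utility with the mean utility of its neighbours. The function name `blended_utility`, the linear form, and the use of a neighbour mean are all assumptions made here for illustration, not the authors' definition.

```python
def blended_utility(own, neighbours, selfish_index):
    """Blend an agent's own utility with its neighbours' mean utility.

    Hypothetical form: selfish_index = 1.0 means the agent considers only
    its own utility; 0.0 means it considers only its neighbours.
    """
    if not 0.0 <= selfish_index <= 1.0:
        raise ValueError("selfish_index must lie in [0, 1]")
    if not neighbours:
        # With no neighbours there is no group term to blend in.
        return own
    group = sum(neighbours) / len(neighbours)
    return selfish_index * own + (1.0 - selfish_index) * group
```

Under this assumed form, sweeping `selfish_index` from 0 to 1 moves each intersection's reward from fully group-oriented to fully individual, which is one way the effect on total group utility could be probed.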
This article reports on the first international Competition of Distributed and Multiagent Planners (CoDMAP). The competition focused on cooperative domain-independent planners compatible with a minimal multiagent extension of the classical planning model. The motivations for the competition were manifold: to standardize the problem description language with a common set of benchmarks, to promote development of multiagent planners both inside and outside of the multiagent research community, and to serve as a prototype for future multiagent planning competitions. The article provides an overview of cooperative multiagent planning, describes a novel variant of a standardized input language for encoding multiagent planning problems, and summarizes the key points of the organization, competing planners, and results of the competition.
This paper describes a purely data-driven solution to a class of sequential decision-making problems with a large number of concurrent online decisions, with applications to computing systems and operations research. We assume that while the micro-level behaviour of the system can be broadly captured by analytical expressions or simulation, the macro-level or emergent behaviour is complicated by non-linearity, constraints, and stochasticity. If we represent the set of concurrent decisions to be computed as a vector, each element of the vector is assumed to be a continuous variable, and the number of such elements is arbitrarily large and variable from one problem instance to another. We first formulate the decision-making problem as a canonical reinforcement learning (RL) problem, which can be solved using purely data-driven techniques. We modify a standard approach known as advantage actor critic (A2C) to ensure its suitability to the problem at hand, and compare its performance to that of baseline approaches on the specific instance of a multi-product inventory management task. The key modifications include a parallelised formulation of the decision-making task, and a training procedure that explicitly recognises the quantitative relationship between different decisions. We also present experimental results probing the learned policies, and their robustness to variations in the data.
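For readers unfamiliar with A2C, the advantage signal on which the actor is trained can be sketched as follows. This is the textbook one-step estimate, A = r + γV(s') − V(s), applied to a batch of transitions. It is not the modified training procedure the abstract describes, and the function name and default discount factor are illustrative assumptions.

```python
import numpy as np

def one_step_advantages(rewards, values, next_values, gamma=0.99):
    """Return A = r + gamma * V(s') - V(s) for each transition in a batch."""
    rewards = np.asarray(rewards, dtype=float)
    values = np.asarray(values, dtype=float)
    next_values = np.asarray(next_values, dtype=float)
    # A positive advantage means the action did better than the critic expected.
    return rewards + gamma * next_values - values
```

In a standard A2C update, the actor's log-probabilities are weighted by these advantages while the critic is regressed towards `rewards + gamma * next_values`.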
This work investigates the effect of agents' characteristics on the mean performance of a total system when the agents are involved in exclusive use of shared resources. In this paper, we model an agent that is sensitive to forces originating both from an energy supply base and from adjacent agents. We characterize the agents' properties with two parameters and investigate the relationship between these two parameters and performance measures such as the rate of energy replenishment.