Agents
Multi-Agent Safe Planning with Gaussian Processes
Zhu, Zheqing, Bıyık, Erdem, Sadigh, Dorsa
Multi-agent safe systems have become an increasingly important area of study as we can now easily have multiple AI-powered systems operating together. In such settings, we need to ensure the safety of not only each individual agent, but also the overall system. In this paper, we introduce a novel multi-agent safe learning algorithm that enables decentralized safe navigation when there are multiple different agents in the environment. This algorithm makes mild assumptions about other agents and is trained in a decentralized fashion, i.e., with very little prior knowledge about other agents' policies. Experiments show that our algorithm performs well alongside robots running other algorithms when optimizing various objectives.
Explanation Generation for Multi-Modal Multi-Agent Path Finding with Optimal Resource Utilization using Answer Set Programming
The multi-agent path finding (MAPF) problem is a combinatorial search problem that aims at finding paths for multiple agents (e.g., robots) in an environment (e.g., an autonomous warehouse) such that no two agents collide with each other, and subject to some constraints on the lengths of paths. We consider a general version of MAPF, called mMAPF, that involves multi-modal transportation modes (e.g., due to velocity constraints) and consumption of different types of resources (e.g., batteries). The real-world applications of mMAPF require flexibility (e.g., solving variations of mMAPF) as well as explainability. Our earlier studies on mMAPF have focused on the former challenge of flexibility. In this study, we focus on the latter challenge of explainability, and introduce a method for generating explanations for queries regarding the feasibility and optimality of solutions, the nonexistence of solutions, and the observations about solutions. Our method is based on answer set programming. This paper is under consideration for acceptance in TPLP.
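The collision-freeness condition at the core of MAPF can be illustrated with a minimal sketch. This is not the answer set programming encoding used in the paper; it is a plain Python check, under the assumptions that each path is a list of cells indexed by time step, that an agent waits at its last cell once its path ends, and that both vertex conflicts (same cell, same time) and edge conflicts (two agents swapping cells) count as collisions. The function name `find_conflicts` and the tuple layout are hypothetical.

```python
from itertools import combinations


def find_conflicts(paths):
    """Return vertex and edge (swap) conflicts among timed paths.

    paths: dict mapping agent name -> list of cells, one per time step.
    Assumption: an agent that has finished its path waits at its last cell.
    """
    horizon = max(len(p) for p in paths.values())

    def at(path, t):
        # Position at time t, clamped to the final cell of the path.
        return path[min(t, len(path) - 1)]

    conflicts = []
    for (a, pa), (b, pb) in combinations(paths.items(), 2):
        for t in range(horizon):
            if at(pa, t) == at(pb, t):
                # Both agents occupy the same cell at the same time.
                conflicts.append(("vertex", a, b, t))
            elif (t + 1 < horizon
                  and at(pa, t) == at(pb, t + 1)
                  and at(pa, t + 1) == at(pb, t)):
                # The agents exchange cells between t and t + 1.
                conflicts.append(("edge", a, b, t))
    return conflicts
```

A plan is collision-free in this sense exactly when `find_conflicts` returns an empty list; the resource and path-length constraints of mMAPF are not modelled here.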
Modelling Multi-Agent Epistemic Planning in ASP
Burigana, Alessandro, Fabiano, Francesco, Dovier, Agostino, Pontelli, Enrico
Designing agents that reason and act upon the world has always been one of the main objectives of the Artificial Intelligence community. While for planning in "simple" domains the agents can solely rely on facts about the world, in several contexts, e.g., economy, security, justice and politics, the mere knowledge of the world could be insufficient to reach a desired goal. In these scenarios, epistemic reasoning, i.e., reasoning about agents' beliefs about themselves and about other agents' beliefs, is essential to design winning strategies. This paper addresses the problem of reasoning in multi-agent epistemic settings by exploiting declarative programming techniques. In particular, the paper presents an actual implementation of a multi-shot Answer Set Programming-based planner that can reason in multi-agent epistemic settings, called PLATO (ePistemic muLti-agent Answer seT programming sOlver). The ASP paradigm enables a concise and elegant design of the planner compared with imperative implementations, facilitating the formal verification of its correctness. The paper shows how the planner, exploiting an ad-hoc epistemic state representation and the efficiency of ASP solvers, achieves competitive performance on benchmarks collected from the literature. It is under consideration for acceptance in TPLP.
Impact of meta-roles on the evolution of organisational institutions
Sedigh, Amir Hosein Afshar, Purvis, Martin K., Savarimuthu, Bastin Tony Roy, Purvis, Maryam A., Frantz, Christopher K.
This paper investigates the impact of changes in agents' beliefs coupled with dynamics in agents' meta-roles on the evolution of institutions. The study embeds agents' meta-roles in the BDI architecture. In this context, the study scrutinises the impact of cognitive dissonance in agents due to unfairness of institutions. To showcase our model, two historical long-distance trading societies, namely the Armenian merchants of New-Julfa and the English East India Company, are simulated. Results show how changes in the roles of agents, coupled with specific institutional characteristics, lead to changes in the rules of the system.
Review of Swarm Intelligence-based Feature Selection Methods
Rostami, Mehrdad, Berahmand, Kamal, Forouzandeh, Saman
In the past decades, the rapid growth of computer and database technologies has led to the rapid growth of large-scale datasets. At the same time, data mining applications with high-dimensional datasets that require high speed and accuracy are rapidly increasing. An important issue with these applications is the curse of dimensionality, where the number of features is much higher than the number of patterns. One of the dimensionality reduction approaches is feature selection, which can increase the accuracy of the data mining task and reduce its computational complexity. A feature selection method aims at selecting a subset of features with the lowest inner similarity and the highest relevancy to the target class. It reduces the dimensionality of the data by eliminating irrelevant, redundant, or noisy data. In this paper, a comparative analysis of different feature selection methods is presented, and a general categorization of these methods is performed. Moreover, state-of-the-art swarm intelligence algorithms are studied, and the recent feature selection methods based on these algorithms are reviewed. Furthermore, the strengths and weaknesses of the different swarm intelligence-based feature selection methods studied are evaluated.
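The selection criterion described above (low similarity among the chosen features, high relevance to the target) can be made concrete with a small sketch. This is not a swarm intelligence method and is not taken from the review; it is a greedy filter, under the assumptions that relevance and similarity are both measured by absolute Pearson correlation and that a feature's score is its relevance minus its mean similarity to the features already chosen. The names `pearson` and `select_features` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def select_features(columns, target, k):
    """Greedily pick up to k features by relevance minus mean redundancy.

    columns: dict mapping feature name -> list of values.
    target:  list of target values, same length as each column.
    """
    selected = []
    remaining = list(columns)

    def score(name):
        # Relevance: absolute correlation with the target.
        relevance = abs(pearson(columns[name], target))
        if not selected:
            return relevance
        # Redundancy: mean absolute correlation with already selected features.
        redundancy = sum(
            abs(pearson(columns[name], columns[s])) for s in selected
        ) / len(selected)
        return relevance - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With a feature and an exact rescaled copy of it both available, the copy is penalised by its redundancy term once the original has been chosen, so a less relevant but independent feature is preferred as the second pick. Swarm-based methods search over feature subsets instead of building one greedily, but they typically optimise a criterion of this general kind.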
The Emergence of Adversarial Communication in Multi-Agent Reinforcement Learning
Blumenkamp, Jan, Prorok, Amanda
Many real-world problems require the coordination of multiple autonomous agents. Recent work has shown the promise of Graph Neural Networks (GNNs) to learn explicit communication strategies that enable complex multi-agent coordination. These works use models of cooperative multi-agent systems whereby agents strive to achieve a shared global goal. When considering agents with self-interested local objectives, the standard design choice is to model these as separate learning systems (albeit sharing the same environment). Such a design choice, however, precludes the existence of a single, differentiable communication channel, and consequently prohibits the learning of inter-agent communication strategies. In this work, we address this gap by presenting a learning model that accommodates individual non-shared rewards and a differentiable communication channel that is common among all agents. We focus on the case where agents have self-interested objectives, and develop a learning algorithm that elicits the emergence of adversarial communications. We perform experiments on multi-agent coverage and path planning problems, and employ a post-hoc interpretability technique to visualize the messages that agents communicate to each other. We show how a single self-interested agent is capable of learning highly manipulative communication strategies that allow it to significantly outperform a cooperative team of agents.
Deep Q-Network Based Multi-agent Reinforcement Learning with Binary Action Agents
Hafiz, Abdul Mueed, Bhat, Ghulam Mohiuddin
Deep Q-Network (DQN) based multi-agent systems (MAS) for reinforcement learning (RL) use various schemes wherein the agents have to learn and communicate. The learning is, however, specific to each agent, and communication may be satisfactorily designed for the agents. As more complex Deep Q-Networks come to the fore, the overall complexity of the multi-agent system increases, leading to issues such as difficulty in training, the need for more resources and training time, and difficulty in fine-tuning. To address these issues, we propose a simple but efficient DQN based MAS for RL which uses shared state and rewards, but agent-specific actions, for updating the experience replay pool of the DQNs, where each agent is a DQN. The benefits of the approach are overall simplicity, faster convergence and better performance compared to conventional DQN based approaches. It should be noted that the method can be extended to any DQN. As such, we use simple DQN and DDQN (Double Q-learning) on three separate tasks, i.e., Cartpole-v1 (OpenAI Gym environment), LunarLander-v2 (OpenAI Gym environment) and Maze Traversal (customized environment). The proposed approach outperforms the baseline by a decent margin on each of these tasks.
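One possible reading of "shared state and rewards, but agent-specific actions" is sketched below. The abstract does not give the data layout, so everything here is an assumption: one replay pool per agent, each joint transition written to every pool with the common state, reward and next state but only that agent's own action. The class name `SharedExperiencePools` and its methods are hypothetical, and the DQNs themselves are omitted.

```python
import random
from collections import deque


class SharedExperiencePools:
    """Per-agent replay pools fed from joint transitions (illustrative only)."""

    def __init__(self, n_agents, capacity=10_000):
        # One bounded FIFO pool per agent; old transitions are dropped first.
        self.pools = [deque(maxlen=capacity) for _ in range(n_agents)]

    def push(self, state, actions, reward, next_state, done):
        """Store one joint step: shared state/reward, one action per agent."""
        if len(actions) != len(self.pools):
            raise ValueError("expected exactly one action per agent")
        for pool, action in zip(self.pools, actions):
            pool.append((state, action, reward, next_state, done))

    def sample(self, agent, batch_size, rng=random):
        """Draw a minibatch (without replacement) for a single agent."""
        pool = list(self.pools[agent])
        return rng.sample(pool, min(batch_size, len(pool)))
```

With binary-action agents, each entry of `actions` would be 0 or 1, and each agent's network would be trained on minibatches drawn from its own pool.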
Iterative Pre-Conditioning for Expediting the Gradient-Descent Method: The Distributed Linear Least-Squares Problem
Chakrabarti, Kushal, Gupta, Nirupam, Chopra, Nikhil
This paper considers the multi-agent linear least-squares problem in a server-agent network. In this problem, the system comprises multiple agents, each having a set of local data points, that are connected to a server. The goal for the agents is to compute a linear mathematical model that optimally fits the collective data points held by all the agents, without sharing their individual local data points. This goal can be achieved, in principle, using the server-agent variant of the traditional iterative gradient-descent method. The gradient-descent method converges linearly to a solution, and its rate of convergence is lower bounded by the conditioning of the agents' collective data points. If the data points are ill-conditioned, the gradient-descent method may require a large number of iterations to converge. We propose an iterative pre-conditioning technique that mitigates the deleterious effect of the conditioning of data points on the rate of convergence of the gradient-descent method. We rigorously show that the resulting pre-conditioned gradient-descent method, with the proposed iterative pre-conditioning, achieves superlinear convergence when the least-squares problem has a unique solution. In general, the convergence is linear, with an improved rate of convergence in comparison to the traditional gradient-descent method and the state-of-the-art accelerated gradient-descent methods. We further illustrate the improved rate of convergence of our proposed algorithm through experiments on different real-world least-squares problems in both noise-free and noisy computation environments.
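The setting can be written out in a few lines. The notation below is assumed for illustration and is not taken from the paper: agent i holds data (A_i, b_i), \delta is a step size, and K is a pre-conditioner matrix. The abstract does not state how the paper updates K iteratively, so only the generic form of a pre-conditioned step is shown.

```latex
\begin{aligned}
\text{cost:}\quad
  & f(x) = \tfrac{1}{2} \sum_{i=1}^{m} \lVert A_i x - b_i \rVert^2 \\
\text{local gradient at agent } i\text{:}\quad
  & g_i(x) = A_i^{\top} (A_i x - b_i) \\
\text{server update (gradient descent):}\quad
  & x_{t+1} = x_t - \delta \sum_{i=1}^{m} g_i(x_t) \\
\text{pre-conditioned update:}\quad
  & x_{t+1} = x_t - \delta \, K \sum_{i=1}^{m} g_i(x_t)
\end{aligned}
```

Only the gradients g_i(x_t) travel to the server, so the local data stay with the agents. For plain gradient descent, the best achievable linear rate with a constant step size is (\kappa - 1)/(\kappa + 1), where \kappa is the condition number of \sum_i A_i^{\top} A_i, which is why ill-conditioned data slow convergence. In the idealised case where that matrix is invertible, choosing K = (\sum_i A_i^{\top} A_i)^{-1} and \delta = 1 reaches the solution in a single step; an iterative pre-conditioning scheme can be viewed as working toward such a K without computing the inverse directly.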
Explanation of Reinforcement Learning Model in Dynamic Multi-Agent System
Recently, there has been increasing interest in transparency and interpretability in Deep Reinforcement Learning (DRL) systems. Verbal explanations, as the most natural way of communication in our daily life, deserve more attention, since they allow users to gain a better understanding of the system, which ultimately could lead to a high level of trust and smooth collaboration. This paper reports novel work on generating verbal explanations for the behaviors of DRL agents. A rule-based model is designed to construct explanations using a series of rules which are predefined with prior knowledge. A learning model is then proposed to extend the implicit logic of generating verbal explanations to general situations by employing rule-based explanations as training data.
Deep Reinforcement Learning for Field Development Optimization
The field development optimization (FDO) problem represents a challenging mixed-integer nonlinear programming (MINLP) problem in which we seek to obtain the number of wells, their type, location, and drilling sequence that maximizes an economic metric. Evolutionary optimization algorithms have been effectively applied to solve the FDO problem; however, these methods provide only a single deterministic solution, which is generally not robust to small changes in the problem setup. In this work, the goal is to apply convolutional neural network-based (CNN) deep reinforcement learning (DRL) algorithms to the field development optimization problem in order to obtain a policy that maps from different states or representations of the underlying geological model to optimal decisions. The proximal policy optimization (PPO) algorithm is considered with two CNN architectures that differ in number of layers and composition. Both networks obtained policies that provide satisfactory results when compared to a hybrid particle swarm optimization - mesh adaptive direct search (PSO-MADS) algorithm that has been shown to be effective at solving the FDO problem.