 Agent Societies


Random Feature Models for Learning Interacting Dynamical Systems

arXiv.org Artificial Intelligence

Particle dynamics and multi-agent systems provide accurate dynamical models for studying and forecasting the behavior of complex interacting systems. They often take the form of a high-dimensional system of differential equations parameterized by an interaction kernel that models the underlying attractive or repulsive forces between agents. We consider the problem of constructing a data-based approximation of the interacting forces directly from noisy observations of the paths of the agents in time. The learned interaction kernels are then used to predict the agents' behavior over a longer time interval. The approximation developed in this work uses a randomized feature algorithm and a sparse randomized feature approach. Sparsity-promoting regression provides a mechanism for pruning the randomly generated features, which was observed to be beneficial when one has limited data, in particular leading to less overfitting than other approaches. In addition, imposing sparsity reduces the kernel evaluation cost, which significantly lowers the simulation cost for forecasting the multi-agent systems. Our method is applied to various examples, including first-order systems with homogeneous and heterogeneous interactions, second-order homogeneous systems, and a new sheep swarming system.


Distributed Interaction Graph Construction for Dynamic DCOPs in Cooperative Multi-agent Systems

arXiv.org Artificial Intelligence

DCOP algorithms usually rely on interaction graphs to operate. In open and dynamic environments, such methods need to address how this interaction graph is generated and maintained among agents. Existing methods either require reconstructing the entire graph upon detecting changes in the environment or assume that new agents know their potential neighbors to facilitate connection. We propose a novel distributed interaction graph construction algorithm to address this problem. The proposed method does not assume a predefined constraint graph and stabilizes after disruptive changes in the environment. We evaluate our approach by pairing it with existing DCOP algorithms to solve several generated dynamic problems. The experimental results show that the proposed algorithm effectively constructs and maintains a stable multi-agent interaction graph for open and dynamic environments.


Towards a more efficient computation of individual attribute and policy contribution for post-hoc explanation of cooperative multi-agent systems using Myerson values

arXiv.org Artificial Intelligence

Research in the field of Multi-Agent Systems (MAS) suggests viable pathways to solve complex tasks [1]. In a MAS environment, every agent is, in principle, an individual independent of one another with its own characteristics and skills. The main idea is that by assigning to each agent a specific subtask according to its perks and hence exploiting a delocalized control, it is possible to solve a problem more efficiently. The human society itself is an example of a MAS since groups of individuals usually train according to their nature to exercise specific professions that require different expertise: medical personnel, firefighters, engineers, etc. When analyzing the behavior of agents in a MAS a question arises immediately: according to a common goal to be reached, which agent is contributing the most, and which are its most important individual properties? While Shapley's analysis was originally thought to quantify the worth of human agents in a team, its application is straightforward to every other possible transferable utility coalitional game that respects the needed mathematical properties. The field of possible applications of Shapley and Myerson analyses or their generalizations is broad. Shapley analysis or its suitable generalizations can be applied for instance to estimate the contributions of basketball players in a match using the recorded match data and statistics [3]. If the practitioner possesses some information about the connectivity of interactions, or, e.g., spatial rules of the game that restrict the interaction among agents, Shapley and Myerson analyses can be used to assess the importance of vertices, i.e., agents, in graphs. Recent works investigated the Shapley and Myerson analyses of transportation networks [4] and bus-holding strategies [5].
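To make the notion of an agent's contribution concrete, the following is a minimal sketch of an exact Shapley value computation for a small transferable utility game. It is not the paper's method (which concerns Myerson values and a more efficient computation); the function `shapley_values`, the toy game `toy_value`, and the agent names are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average each player's marginal
    contribution over every ordering of the players."""
    totals = {p: Fraction(0) for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining this coalition.
            totals[p] += Fraction(value(with_p)) - Fraction(value(coalition))
            coalition = with_p
    return {p: totals[p] / len(orderings) for p in players}


def toy_value(coalition):
    """Hypothetical game: the team earns 1 only if agent "a"
    cooperates with at least one other agent."""
    return 1 if "a" in coalition and len(coalition) >= 2 else 0


phi = shapley_values(["a", "b", "c"], toy_value)
print(phi)
```

Enumerating all orderings costs time factorial in the number of agents, which is why cheaper approximations matter for larger systems.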


Consensus Learning for Cooperative Multi-Agent Reinforcement Learning

arXiv.org Artificial Intelligence

Almost all multi-agent reinforcement learning algorithms without communication follow the principle of centralized training with decentralized execution. During centralized training, agents can be guided by the same signals, such as the global state. During decentralized execution, however, agents lack the shared signal. Inspired by viewpoint invariance and contrastive learning, we propose consensus learning for cooperative multi-agent reinforcement learning in this paper. Although based on local observations, different agents can infer the same consensus in discrete space. During decentralized execution, we feed the inferred consensus as an explicit input to the network of agents, thereby developing their spirit of cooperation. Our proposed method can be extended to various multi-agent reinforcement learning algorithms with small model changes. Moreover, we evaluate it on several fully cooperative tasks and obtain convincing results.


The Core of Approval Participatory Budgeting with Uniform Costs (or with up to Four Projects) is Non-Empty

arXiv.org Artificial Intelligence

In the Approval Participatory Budgeting problem, an agent prefers a set of projects $W'$ over $W$ if she approves strictly more projects in $W'$. A set of projects $W$ is in the core if there is no other set of projects $W'$ and set of agents $K$ such that all agents in $K$ prefer $W'$ over $W$ and $K$ can fund $W'$. It is an open problem whether the core can be empty, even when project costs are uniform. The latter case is known as the multiwinner voting core. We show that in any instance with uniform costs or with at most four projects (and any number of agents), the core is nonempty.
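The core condition above can be checked by brute force on small instances. The sketch below is illustrative only: the abstract does not say what "can fund" means, so the code assumes the common convention that a coalition $K$ of the $n$ agents controls a $|K|/n$ share of the budget. The function `in_core` and the example instance are hypothetical.

```python
from itertools import combinations


def in_core(W, approvals, costs, budget):
    """Return True if no coalition blocks the project set W.

    Assumption (not from the abstract): a coalition K can fund W'
    when cost(W') <= budget * |K| / n, with n the number of agents.
    """
    n = len(approvals)
    projects = list(costs)
    W = set(W)
    for size in range(1, len(projects) + 1):
        for alt in combinations(projects, size):
            alt = set(alt)
            # The largest candidate coalition: every agent who approves
            # strictly more projects in alt than in W.
            supporters = sum(1 for A in approvals if len(A & alt) > len(A & W))
            alt_cost = sum(costs[p] for p in alt)
            if supporters > 0 and alt_cost * n <= budget * supporters:
                return False
    return True


# Hypothetical instance: four agents, three unit-cost projects, budget 2.
approvals = [{"p1"}, {"p1"}, {"p2"}, {"p3"}]
costs = {"p1": 1, "p2": 1, "p3": 1}

blocked = in_core({"p2", "p3"}, approvals, costs, budget=2)
stable = in_core({"p1", "p2"}, approvals, costs, budget=2)
print(blocked, stable)
```

In this instance the two agents who approve p1 hold half the budget, enough to fund p1 themselves, so any outcome that omits p1 is blocked.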


Partial gathering of mobile agents in dynamic rings

arXiv.org Artificial Intelligence

In this paper, we consider the partial gathering problem of mobile agents in synchronous dynamic bidirectional ring networks. When k agents are distributed in the network, the partial gathering problem requires, for a given positive integer g (< k), that agents terminate in a configuration such that either at least g agents or no agent exists at each node. So far, the partial gathering problem has been considered only in static graphs. In this paper, we begin to consider partial gathering in dynamic graphs. As a first step, we consider this problem in 1-interval connected rings, that is, rings in which one of the links may be missing at each time step. In such networks, focusing on the relationship between the values of k and g, we fully characterize the solvability of the partial gathering problem and analyze the move complexity of the proposed algorithms when the problem can be solved. First, we show that the g-partial gathering problem is unsolvable when k <= 2g. Second, we show that the problem can be solved in O(n log g) time with a total of O(gn log g) moves when 2g + 1 <= k <= 3g - 2. Third, we show that the problem can be solved in O(n) time with a total of O(kn) moves when 3g - 1 <= k <= 8g - 4. Notice that since k = O(g) holds when 3g - 1 <= k <= 8g - 4, the move complexity O(kn) in this case can also be written as O(gn). Finally, we show that the problem can be solved in O(n) time with a total of O(gn) moves when k >= 8g - 3. These results mean that the partial gathering problem can also be solved in dynamic rings when k >= 2g + 1. In addition, agents require a total of Ω(gn) moves to solve the partial (resp., total) gathering problem. Thus, when k >= 3g - 1, agents can solve the partial gathering problem with an asymptotically optimal total of O(gn) moves.
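The case analysis in this abstract can be summarized as a small lookup. The sketch below only restates the stated bounds as a function of k and g; the function name `partial_gathering_bounds` is hypothetical and the code does not implement any gathering algorithm.

```python
def partial_gathering_bounds(k, g):
    """Return (time, total moves) bounds stated in the abstract for
    g-partial gathering of k agents in a 1-interval connected ring,
    or None when the problem is unsolvable."""
    if not 1 <= g < k:
        raise ValueError("requires a positive integer g with g < k")
    if k <= 2 * g:
        return None  # unsolvable
    if k <= 3 * g - 2:
        return ("O(n log g)", "O(gn log g)")
    if k <= 8 * g - 4:
        # k = O(g) in this range, so O(kn) can also be written as O(gn).
        return ("O(n)", "O(kn)")
    return ("O(n)", "O(gn)")


for k in (6, 7, 8, 21):
    print(k, partial_gathering_bounds(k, 3))
```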


E-MAPP: Efficient Multi-Agent Reinforcement Learning with Parallel Program Guidance

arXiv.org Artificial Intelligence

A critical challenge in multi-agent reinforcement learning (MARL) is for multiple agents to efficiently accomplish complex, long-horizon tasks. The agents often have difficulties in cooperating on common goals, dividing complex tasks, and planning through several stages to make progress. We propose to address these challenges by guiding agents with programs designed for parallelization, since programs as a representation contain rich structural and semantic information, and are widely used as abstractions for long-horizon tasks. Specifically, we introduce Efficient Multi-Agent Reinforcement Learning with Parallel Program Guidance (E-MAPP), a novel framework that leverages parallel programs to guide multiple agents to efficiently accomplish goals that require planning over 10+ stages. E-MAPP integrates the structural information from a parallel program, promotes the cooperative behaviors grounded in program semantics, and improves the time efficiency via a task allocator. We conduct extensive experiments on a series of challenging, long-horizon cooperative tasks in the Overcooked environment. Results show that E-MAPP outperforms strong baselines in terms of the completion rate, time efficiency, and zero-shot generalization ability by a large margin.


Learning Trust Over Directed Graphs in Multiagent Systems (extended version)

arXiv.org Artificial Intelligence

We address the problem of learning the legitimacy of other agents in a multiagent network when an unknown subset is comprised of malicious actors. We specifically derive results for the case of directed graphs and where stochastic side information, or observations of trust, is available. We refer to this as "learning trust" since agents must identify which neighbors in the network are reliable, and we derive a protocol to achieve this. We also provide analytical results showing that under this protocol i) agents can learn the legitimacy of all other agents almost surely, and that ii) the opinions of the agents converge in mean to the true legitimacy of all other agents in the network. Lastly, we provide numerical studies showing that our convergence results hold in practice for various network topologies and variations in the number of malicious agents in the network.


Multi Agent Path Finding using Evolutionary Game Theory

arXiv.org Artificial Intelligence

In this paper, we consider the problem of path finding for a set of homogeneous and autonomous agents navigating a previously unknown stochastic environment. In our problem setting, each agent attempts to maximize a given utility function while respecting safety properties. Our solution is based on ideas from evolutionary game theory, namely replicating policies that perform well and diminishing ones that do not. We carry out a comprehensive comparison with related multiagent planning methods, and show that our technique beats state-of-the-art RL algorithms in minimizing path length by nearly 30% in large spaces. We show that our algorithm is computationally faster than deep RL methods by at least an order of magnitude. We also show that it scales better with an increase in the number of agents than other methods, path planning methods in particular. Lastly, we show empirically that the policies that we learn are evolutionarily stable and thus impervious to invasion by any other policy.
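The idea of replicating policies that perform well and diminishing ones that do not can be illustrated with a discrete-time replicator update. This is a generic textbook-style sketch, not the paper's algorithm: the function `replicator_step`, the two-policy population, and the fixed positive fitness values are all assumptions made for illustration.

```python
def replicator_step(shares, fitness):
    """One discrete replicator update: each policy's population share
    is rescaled by its fitness relative to the population average.
    Assumes all fitness values are strictly positive."""
    mean_fitness = sum(s * f for s, f in zip(shares, fitness))
    return [s * f / mean_fitness for s, f in zip(shares, fitness)]


# Hypothetical population: two policies, the first twice as fit.
shares = [0.5, 0.5]
fitness = [2.0, 1.0]
for _ in range(10):
    shares = replicator_step(shares, fitness)

print(shares)
```

With fixed fitness values the fitter policy takes over the population; in a game setting the fitness would itself depend on the current shares.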


ACE: Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency

arXiv.org Artificial Intelligence

Multi-agent reinforcement learning (MARL) suffers from the non-stationarity problem: the learning targets change at every iteration when multiple agents update their policies at the same time. Starting from first principles, in this paper we address the non-stationarity problem by proposing bidirectional action-dependent Q-learning (ACE). Central to the development of ACE is the sequential decision-making process wherein only one agent is allowed to take action at one time. Within this process, each agent maximizes its value function given the actions taken by the preceding agents at the inference stage. In the learning phase, each agent minimizes the TD error that is dependent on how the subsequent agents have reacted to their chosen action. Given the design of bidirectional dependency, ACE effectively turns a multi-agent MDP into a single-agent MDP. We implement the ACE framework by identifying the proper network representation to formulate the action dependency, so that the sequential decision process is computed implicitly in one forward pass. To validate ACE, we compare it with strong baselines on two MARL benchmarks. Empirical experiments demonstrate that ACE outperforms the state-of-the-art algorithms on Google Research Football and StarCraft Multi-Agent Challenge by a large margin. In particular, on SMAC tasks, ACE achieves 100% success rate on almost all the hard and super-hard maps. We further study extensive research problems regarding ACE, including extension, generalization, and practicability. Code is made available to facilitate further research.