Goto

Collaborating Authors

 Agent Societies


Cooperation without Communication

AI Classics

The single unifying assumption in this work is that one or more of the interacting agents will be using artificial intelligence techniques to guide their actions (including, of course, their communication actions). We call this the "intelligentagent paradigm." Within this broad categorization, the many individual efforts to give Al systems the capability to interact with other rational systems are seen as potentially increasing efficiency (by harnessing multiple reasoners to solve problems in parallel) or as necessitated by the distributed nature of the problem (e.g., distributed air traffic control


Fairness in Multi-Agent Sequential Decision-Making

Neural Information Processing Systems

We define a fairness solution criterion for multi-agent decision-making problems, where agents have local interests. This new criterion aims to maximize the worst performance of agents with consideration on the overall performance. We develop a simple linear programming approach and a more scalable game-theoretic approach for computing an optimal fairness policy. This game-theoretic approach formulates this fairness optimization as a two-player, zero-sum game and employs an iterative algorithm for finding a Nash equilibrium, corresponding to an optimal fairness policy. We scale up this approach by exploiting problem structure and value function approximation. Our experiments on resource allocation problems show that this fairness criterion provides a more favorable solution than the utilitarian criterion, and that our game-theoretic approach is significantly faster than linear programming.


Diverse Randomized Agents Vote to Win

Neural Information Processing Systems

We investigate the power of voting among diverse, randomized software agents. With teams of computer Go agents in mind, we develop a novel theoretical model of two-stage noisy voting that builds on recent work in machine learning. This model allows us to reason about a collection of agents with different biases (determined by the first-stage noise models), which, furthermore, apply randomized algorithms to evaluate alternatives and produce votes (captured by the second-stage noise models). We analytically demonstrate that a uniform team, consisting of multiple instances of any single agent, must make a significant number of mistakes, whereas a diverse team converges to perfection as the number of agents grows. Our experiments, which pit teams of computer Go agents against strong agents, provide evidence for the effectiveness of voting when agents are diverse.


Reinforcement Learning and Nonparametric Detection of Game-Theoretic Equilibrium Play in Social Networks

arXiv.org Machine Learning

The first part of the paper presents a reinforcement learning (adaptive filtering) algorithm that facilitates learning an equilibrium by resorting to diffusion cooperation strategies in a social network. Agents form homophilic social groups, within which they exchange past experiences over an undirected graph. It is shown that, if all agents follow the proposed algorithm, their global behavior is attracted to the correlated equilibria set of the game. The second part of the paper provides a test to detect if the actions of agents are consistent with play from the equilibrium of a concave potential game. The theory of revealed preference from microeconomics is used to construct a nonparametric decision test and statistical test which only require the probe and associated actions of agents. A stochastic gradient algorithm is given to optimize the probe in real time to minimize the Type-II error probabilities of the detection test subject to specified Type-I error probability. We provide a real-world example using the energy market, and a numerical example to detect malicious agents in an online social network. Index Terms--Multi-agent signal processing, non-cooperative games, social networks, correlated equilibrium, diffusion cooperation, homophily behavior, revealed preferences, Afriat's theorem, stochastic approximation algorithm.


Emotional Context in Imitation-Based Learning in Multi-Agent Societies

AAAI Conferences

In this paper we explain how IETAL agents learn their environment, and how they build their intrinsic, internal representation of it, which they then use to build their expectations when on quest to satisfy its active drives. As environments change (with or without other agents present in them), the agents learn to new and “forget” irrelevant, “old” associations made. We discuss the concept of emotional context of associations, and show a gallery of simulations of behaviors in small multiagent societies.


Push and Rotate: a Complete Multi-agent Pathfinding Algorithm

Journal of Artificial Intelligence Research

Multi-agent Pathfinding is a relevant problem in a wide range of domains, for example in robotics and video games research. Formally, the problem considers a graph consisting of vertices and edges, and a set of agents occupying vertices. An agent can only move to an unoccupied, neighbouring vertex, and the problem of finding the minimal sequence of moves to transfer each agent from its start location to its destination is an NP-hard problem. We present Push and Rotate, a new algorithm that is complete for Multi-agent Pathfinding problems in which there are at least two empty vertices. Push and Rotate first divides the graph into subgraphs within which it is possible for agents to reach any position of the subgraph, and then uses the simple push, swap, and rotate operations to find a solution; a post-processing algorithm is also presented that eliminates redundant moves. Push and Rotate can be seen as extending Luna and Bekris's Push and Swap algorithm, which we showed to be incomplete in a previous publication. In our experiments we compare our approach with the Push and Swap, MAPP, and Bibox algorithms. The latter algorithm is restricted to a smaller class of instances as it requires biconnected graphs, but can nevertheless be considered state of the art due to its strong performance. Our experiments show that Push and Swap suffers from incompleteness, MAPP is generally not competitive with Push and Rotate, and Bibox is better than Push and Rotate on randomly generated biconnected instances, while Push and Rotate performs better on grids.


Verification of Agent-Based Artifact Systems

Journal of Artificial Intelligence Research

Artifact systems are a novel paradigm for specifying and implementing business processes described in terms of interacting modules called artifacts. Artifacts consist of data and lifecycles, accounting respectively for the relational structure of the artifacts states and their possible evolutions over time. In this paper we put forward artifact-centric multi-agent systems, a novel formalisation of artifact systems in the context of multi-agent systems operating on them. Differently from the usual process-based models of services, we give a semantics that explicitly accounts for the data structures on which artifact systems are defined. We study the model checking problem for artifact-centric multi-agent systems against specifications expressed in a quantified version of temporal-epistemic logic expressing the knowledge of the agents in the exchange. We begin by noting that the problem is undecidable in general. We identify a noteworthy class of systems that admit bisimilar, finite abstractions. It follows that we can verify these systems by investigating their finite abstractions; we also show that the corresponding model checking problem is EXPSPACE-complete. We then introduce artifact-centric programs, compact and declarative representations of the programs governing both the artifact system and the agents. We show that, while these in principle generate infinite-state systems, under natural conditions their verification problem can be solved on finite abstractions that can be effectively computed from the programs. We exemplify the theoretical results here pursued through a mainstream procurement scenario from the artifact systems literature.


Distributed Heuristic Forward Search for Multi-agent Planning

Journal of Artificial Intelligence Research

This paper deals with the problem of classical planning for multiple cooperative agents who have private information about their local state and capabilities they do not want to reveal. Two main approaches have recently been proposed to solve this type of problem -- one is based on reduction to distributed constraint satisfaction, and the other on partial-order planning techniques. In classical single-agent planning, constraint-based and partial-order planning techniques are currently dominated by heuristic forward search. The question arises whether it is possible to formulate a distributed heuristic forward search algorithm for privacy-preserving classical multi-agent planning. Our work provides a positive answer to this question in the form of a general approach to distributed state-space search in which each agent performs only the part of the state expansion relevant to it. The resulting algorithms are simple and efficient -- outperforming previous algorithms by orders of magnitude -- while offering similar flexibility to that of forward-search algorithms for single-agent planning. Furthermore, one particular variant of our general approach yields a distributed version of the A* algorithm that is the first cost-optimal distributed algorithm for privacy-preserving planning.


Cooperative Monitoring to Diagnose Multiagent Plans

Journal of Artificial Intelligence Research

Diagnosing the execution of a Multiagent Plan (MAP) means identifying and explaining action failures (i.e., actions that did not reach their expected effects). Current approaches to MAP diagnosis are substantially centralized, and assume that action failures are independent of each other. In this paper, the diagnosis of MAPs, executed in a dynamic and partially observable environment, is addressed in a fully distributed and asynchronous way; in addition, action failures are no longer assumed as independent of each other. The paper presents a novel methodology, named Cooperative Weak-Committed Monitoring (CWCM), enabling agents to cooperate while monitoring their own actions. Cooperation helps the agents to cope with very scarcely observable environments: what an agent cannot observe directly can be acquired from other agents. CWCM exploits nondeterministic action models to carry out two main tasks: detecting action failures and building trajectory-sets (i.e., structures representing the knowledge an agent has about the environment in the recent past). Relying on trajectory-sets, each agent is able to explain its own action failures in terms of exogenous events that have occurred during the execution of the actions themselves. To cope with dependent failures, CWCM is coupled with a diagnostic engine that distinguishes between primary and secondary action failures. An experimental analysis demonstrates that the CWCM methodology, together with the proposed diagnostic inferences, are effective in identifying and explaining action failures even in scenarios where the system observability is significantly reduced.


Arbitration and Stability in Cooperative Games with Overlapping Coalitions

Journal of Artificial Intelligence Research

Overlapping Coalition Formation (OCF) games, introduced by Chalkiadakis, Elkind, Markakis, Polukarov and Jennings in 2010, are cooperative games where players can simultaneously participate in several coalitions. Capturing the notion of stability in OCF games is a difficult task:deviating players may continue to contribute resources to joint projects with non-deviators, and the crucial question is what payoffs the deviators expect to receive from such projects. Chalkiadakis et al. introduce three stability concepts for OCF games---the conservative core, the refined core, and the optimistic core---that are based on different answers to this question. In this paper, we propose a unified framework for the study of stability in the OCF setting, which encompasses the stability concepts considered by Chalkiadakis et al. as well as a wide variety of alternative stability concepts. Our approach is based on the notion of arbitration functions, which determine the payoff obtained by the deviators, given their deviation and the current allocation of resources. We provide a characterization of stable outcomes under arbitration. We then conduct an in-depth study of four types of arbitration functions, which correspond to four notions of the core; these include the three notions of the core considered by Chalkiadakis et al. Our results complement those of Chalkiadakis et al. and answer questions left open by their work. In particular, we show that OCF games with the conservative arbitration function are essentially equivalent to non-OCF games, by relating the conservative core of an OCF game to the core of a non-overlapping cooperative game, and use this result to obtain a strictly weaker sufficient condition for conservative core non-emptiness than the one given by Chalkiadakis et al.