Agent Societies
Comparative Analysis of Frameworks for Knowledge-Intensive Intelligent Agents
Jones, Randolph M., Wray, Robert E.
A recurring requirement for human-level artificial intelligence is the incorporation of vast amounts of knowledge into a software agent that can use the knowledge in an efficient and organized fashion. This article discusses representations and processes for agents and behavior models that integrate large, diverse knowledge stores, are long-lived, and exhibit high degrees of competence and flexibility while interacting with complex environments. There are many different approaches to building such agents, and understanding the important commonalities and differences between approaches is often difficult. We introduce a new approach to comparing frameworks based on the notions of commitment, reconsideration, and a categorization of representations and processes. We review four agent frameworks, concentrating on the major representations and processes each directly supports. By organizing the approaches according to a common nomenclature, the analysis highlights points of similarity and difference and suggests directions for integrating and unifying disparate approaches and for incorporating research results from one framework into alternatives.
Asynchronous Partial Overlay: A New Algorithm for Solving Distributed Constraint Satisfaction Problems
Distributed Constraint Satisfaction (DCSP) has long been considered an important problem in multi-agent systems research. This is because many real-world problems can be represented as constraint satisfaction and these problems often present themselves in a distributed form. In this article, we present a new complete, distributed algorithm called asynchronous partial overlay (APO) for solving DCSPs that is based on a cooperative mediation process. The primary ideas behind this algorithm are that agents, when acting as a mediator, centralize small, relevant portions of the DCSP, that these centralized subproblems overlap, and that agents increase the size of their subproblems along critical paths within the DCSP as the problem solving unfolds. We present empirical evidence that shows that APO outperforms other known, complete DCSP techniques.
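The idea of a mediator centralizing a small, relevant portion of the problem can be illustrated with a toy graph-coloring instance. The sketch below is not APO: it omits asynchrony, agent priorities, and the growth of subproblems along critical paths. The graph, the color set, and the names `NEIGHBORS`, `violated`, and `mediate` are assumptions made for illustration only.

```python
from itertools import product

COLORS = ("red", "green", "blue")

# Hypothetical constraint graph: each agent owns one variable and
# every edge is a "must differ" constraint.
NEIGHBORS = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}

def violated(assignment, agents):
    """Count violated constraints on edges touching any agent in `agents`."""
    edges = {frozenset((x, y)) for x in agents for y in NEIGHBORS[x]}
    return sum(1 for e in edges if len({assignment[v] for v in e}) == 1)

def mediate(mediator, assignment):
    """Centralize the mediator and its neighbors, then re-solve that
    subproblem by brute force, minimizing conflicts on all edges it touches."""
    scope = sorted({mediator} | NEIGHBORS[mediator])

    def cost(values):
        trial = {**assignment, **dict(zip(scope, values))}
        return violated(trial, scope)

    best = min(product(COLORS, repeat=len(scope)), key=cost)
    return {**assignment, **dict(zip(scope, best))}

# Start from a fully conflicting assignment; two overlapping mediation
# sessions (scopes {a, b, c} and {c, d, e}) repair it.
assignment = {agent: "red" for agent in NEIGHBORS}
assignment = mediate("a", assignment)
assignment = mediate("d", assignment)
remaining = violated(assignment, NEIGHBORS)
```

The two scopes share agent `c`, which is the sense in which the centralized subproblems overlap in this toy setting.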
Negotiating Socially Optimal Allocations of Resources
Endriss, U., Maudet, N., Sadri, F., Toni, F.
A multiagent system may be thought of as an artificial society of autonomous software agents and we can apply concepts borrowed from welfare economics and social choice theory to assess the social welfare of such an agent society. In this paper, we study an abstract negotiation framework where agents can agree on multilateral deals to exchange bundles of indivisible resources. We then analyse how these deals affect social welfare for different instances of the basic framework and different interpretations of the concept of social welfare itself. In particular, we show how certain classes of deals are both sufficient and necessary to guarantee that a socially optimal allocation of resources will be reached eventually.
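As a rough illustration of one interpretation of social welfare, the sketch below computes utilitarian social welfare (the sum of individual utilities) for allocations of indivisible resources and finds a socially optimal allocation by brute force. The agents, resources, and utility values are invented for the example, and the exhaustive search stands in for the negotiation process rather than reproducing it.

```python
from itertools import product

RESOURCES = ("r1", "r2")
AGENTS = ("alice", "bob")

# Hypothetical utilities over bundles (frozensets of resources).
UTILITY = {
    "alice": {frozenset(): 0, frozenset({"r1"}): 4,
              frozenset({"r2"}): 1, frozenset({"r1", "r2"}): 5},
    "bob":   {frozenset(): 0, frozenset({"r1"}): 2,
              frozenset({"r2"}): 3, frozenset({"r1", "r2"}): 6},
}

def bundles(owners):
    """Turn a tuple of owners (one per resource) into agent -> bundle."""
    return {
        agent: frozenset(r for r, o in zip(RESOURCES, owners) if o == agent)
        for agent in AGENTS
    }

def utilitarian_welfare(owners):
    """Sum of the agents' utilities for the bundles they hold."""
    alloc = bundles(owners)
    return sum(UTILITY[agent][alloc[agent]] for agent in AGENTS)

# Enumerate every way to assign each resource to one agent.
allocations = list(product(AGENTS, repeat=len(RESOURCES)))
optimal = max(allocations, key=utilitarian_welfare)
```

Here `optimal` assigns `r1` to `alice` and `r2` to `bob`, for a welfare of 7. Other notions of welfare (for example egalitarian welfare, the minimum of the utilities) would replace `sum` with a different aggregate.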
Solving Transition Independent Decentralized Markov Decision Processes
Becker, R., Zilberstein, S., Lesser, V., Goldman, C. V.
Formal treatment of collaborative multi-agent systems has been lagging behind the rapid progress in sequential decision making by individual agents. Recent work in the area of decentralized Markov Decision Processes (MDPs) has contributed to closing this gap, but the computational complexity of these models remains a serious obstacle. To overcome this complexity barrier, we identify a specific class of decentralized MDPs in which the agents' transitions are independent. The class consists of independent collaborating agents that are tied together through a structured global reward function that depends on all of their histories of states and actions. We present a novel algorithm for solving this class of problems and examine its properties, both as an optimal algorithm and as an anytime algorithm. To the best of our knowledge, this is the first algorithm to optimally solve a non-trivial subclass of decentralized MDPs. It lays the foundation for further work in this area on both exact and approximate algorithms.
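Transition independence means that the joint transition probability factors into one local transition model per agent. The minimal sketch below, for two agents, builds the joint probability as a product of local ones; the state names, actions, and probabilities are made up for illustration, and the global reward function that couples the agents is not modeled.

```python
# Hypothetical local models: (state, action) -> {next_state: probability}.
P1 = {("idle", "go"): {"idle": 0.5, "busy": 0.5}}
P2 = {("low", "charge"): {"low": 0.25, "high": 0.75}}

def joint_transition(p1, p2, state, action, next_state):
    """P(s1', s2' | s1, s2, a1, a2) = P1(s1' | s1, a1) * P2(s2' | s2, a2)."""
    (s1, s2), (a1, a2), (n1, n2) = state, action, next_state
    return p1[(s1, a1)].get(n1, 0.0) * p2[(s2, a2)].get(n2, 0.0)

# The factored joint distribution still sums to one over joint next states.
total = sum(
    joint_transition(P1, P2, ("idle", "low"), ("go", "charge"), (n1, n2))
    for n1 in ("idle", "busy")
    for n2 in ("low", "high")
)
```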
Decentralized Control of Cooperative Systems: Categorization and Complexity Analysis
Goldman, C. V., Zilberstein, S.
The difficulty in solving decentralized control problems optimally arises when the agents lack full observability of the global state of the system while they operate. The general problem has been shown to be NEXP-complete. In this paper, we identify classes of decentralized control problems whose complexity ranges between NEXP and P. In particular, we study problems characterized by independent transitions, independent observations, and goal-oriented objective functions. Two algorithms are shown to optimally solve useful classes of goal-oriented decentralized processes in polynomial time. This paper also studies information sharing among the decision-makers, which can improve their performance. We distinguish between three ways in which agents can exchange information: indirect communication, direct communication, and sharing state features that are not controlled by the agents. Our analysis shows that for every class of problems we consider, introducing direct or indirect communication does not change the worst-case complexity. The results provide a better understanding of the complexity of decentralized control problems that arise in practice and facilitate the development of planning algorithms for these problems.
PHA*: Finding the Shortest Path with A* in An Unknown Physical Environment
Felner, A., Stern, R., Ben-Yair, A., Kraus, S., Netanyahu, N.
We address the problem of finding the shortest path between two points in an unknown real physical environment, where a traveling agent must move around in the environment to explore unknown territory. We introduce the Physical-A* algorithm (PHA*) for solving this problem. PHA* expands all the mandatory nodes that A* would expand and returns the shortest path between the two points. However, due to the physical nature of the problem, the complexity of the algorithm is measured by the traveling effort of the moving agent and not by the number of generated nodes, as in standard A*. PHA* is presented as a two-level algorithm, such that its high level, A*, chooses the next node to be expanded and its low level directs the agent to that node in order to explore it. We present a number of variations for both the high-level and low-level procedures and evaluate their performance theoretically and experimentally. We show that the travel cost of our best variation is fairly close to the optimal travel cost, assuming that the mandatory nodes of A* are known in advance. We then generalize our algorithm to the multi-agent case, where a number of cooperative agents are designed to solve the problem. Specifically, we provide an experimental implementation for such a system. It should be noted that the problem addressed here is not a navigation problem, but rather a problem of finding the shortest path between two points for future usage.
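The difference between the two cost measures can be sketched as follows. This is not PHA* itself: it only contrasts the number of locations visited with the distance a physical agent covers when visiting them in a given order, and it assumes straight-line travel between locations, which a real environment may not allow. The coordinates and the function name `travel_effort` are hypothetical.

```python
import math

def travel_effort(visit_order):
    """Total straight-line distance covered when visiting points in order."""
    return sum(math.dist(a, b) for a, b in zip(visit_order, visit_order[1:]))

# Both orders reach the same two targets from the origin, but the first
# returns to the start in between and so doubles the physical travel.
back_and_forth = [(0, 0), (3, 4), (0, 0), (6, 8)]
direct = [(0, 0), (3, 4), (6, 8)]

effort_back_and_forth = travel_effort(back_and_forth)  # 5 + 5 + 10
effort_direct = travel_effort(direct)                  # 5 + 5
```

Under a node-count measure the two orders look similar, while under the travel measure the second is half as expensive.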
K-Implementation
This paper discusses an interested party who wishes to influence the behavior of agents in a game (multi-agent interaction) that is not under his control. The interested party cannot design a new game, cannot enforce agents' behavior, cannot enforce payments by the agents, and cannot prohibit strategies available to the agents. However, he can influence the outcome of the game by committing to non-negative monetary transfers for the different strategy profiles that may be selected by the agents. The interested party assumes that agents are rational in the commonly agreed sense that they do not use dominated strategies. Hence, a certain subset of outcomes is implemented in a given game if, by adding non-negative payments, rational players will necessarily produce an outcome in this subset. Obviously, by making sufficiently big payments one can implement any desirable outcome. The question is: what is the cost of implementation? In this paper we introduce the notion of k-implementation of a desired set of strategy profiles, where k stands for the amount of payment that actually needs to be made in order to implement desirable outcomes. A major point in k-implementation is that monetary offers need not necessarily materialize when following desired behaviors. We define and study k-implementation in the contexts of games with complete and incomplete information. In the latter case we mainly focus on the VCG games. Our setting is later extended to deal with mixed strategies using correlation devices. Altogether, the paper introduces and studies the implementation of desirable outcomes by a reliable party who cannot modify game rules (i.e. provide protocols), complementing previous work in mechanism design, while making it more applicable to many realistic CS settings.
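The point that promised payments need not materialize can be seen in a small two-player example. In the sketch below, the interested party promises transfers only on off-diagonal profiles; these promises make strategy `A` dominant for both players, so the only surviving profile is `(A, A)`, where nothing is paid. The payoff numbers are invented, and the code assumes strict dominance by a pure strategy and measures cost as the largest total payment over surviving profiles; both are simplifying assumptions rather than the paper's exact definitions.

```python
from itertools import product

STRATS = ("A", "B")

# Hypothetical base game: profile (row, col) -> (payoff to row, payoff to col).
BASE = {("A", "A"): (3, 3), ("A", "B"): (0, 0),
        ("B", "A"): (0, 0), ("B", "B"): (1, 1)}

# Non-negative transfers promised by the interested party, per profile.
OFFER = {("A", "B"): (2, 0), ("B", "A"): (0, 2)}

def modified(base, offer):
    """Add the promised transfers to the base payoffs."""
    return {p: tuple(u + v for u, v in zip(us, offer.get(p, (0, 0))))
            for p, us in base.items()}

def profile(player, own, other):
    """Build a (row, col) profile from one player's point of view."""
    return (own, other) if player == 0 else (other, own)

def undominated(game, player):
    """Pure strategies of `player` not strictly dominated by another one."""
    def dominated(s):
        return any(
            all(game[profile(player, t, o)][player]
                > game[profile(player, s, o)][player] for o in STRATS)
            for t in STRATS if t != s
        )
    return [s for s in STRATS if not dominated(s)]

def implementation_cost(base, offer):
    """Surviving profiles and the worst-case total payment over them."""
    game = modified(base, offer)
    surviving = list(product(undominated(game, 0), undominated(game, 1)))
    cost = max(sum(offer.get(p, (0, 0))) for p in surviving)
    return surviving, cost

surviving, cost = implementation_cost(BASE, OFFER)
```

With these numbers `surviving` is `[("A", "A")]` and `cost` is `0`, whereas without any offer all four profiles survive.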
Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games
Wang, Xiaofeng, Sandholm, Tuomas
Multiagent learning is a key problem in AI. In the presence of multiple Nash equilibria, even agents with non-conflicting interests may not be able to learn an optimal coordination policy. The problem is exacerbated if the agents do not know the game and independently receive noisy payoffs. So, multiagent reinforcement learning involves two interrelated problems: identifying the game and learning to play.
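The coordination problem can be seen in a single-state team game with a known, noise-free payoff matrix. The sketch below lists the pure Nash equilibria of such a game and picks the optimal one; it does not implement any learning algorithm, and the payoff values are an assumption chosen so that a suboptimal equilibrium exists.

```python
# Hypothetical common payoff: TEAM[i][j] is what both agents receive
# when agent 1 plays action i and agent 2 plays action j.
TEAM = [[10, 0],
        [0, 5]]

def pure_nash(payoff):
    """Pure-strategy Nash equilibria of a two-agent common-payoff game."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    equilibria = []
    for i in range(n_rows):
        for j in range(n_cols):
            # Neither agent can gain by deviating unilaterally.
            row_best = all(payoff[i][j] >= payoff[k][j] for k in range(n_rows))
            col_best = all(payoff[i][j] >= payoff[i][l] for l in range(n_cols))
            if row_best and col_best:
                equilibria.append((i, j))
    return equilibria

equilibria = pure_nash(TEAM)
optimal = max(equilibria, key=lambda ij: TEAM[ij[0]][ij[1]])
```

Both `(0, 0)` and `(1, 1)` are equilibria, but only `(0, 0)` is optimal, so agents that settle on `(1, 1)` have no individual incentive to leave it.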