Goto

Collaborating Authors

 Agents


Exploiting Agent and Type Independence in Collaborative Graphical Bayesian Games

arXiv.org Artificial Intelligence

Efficient collaborative decision making is an important challenge for multiagent systems. Finding optimal joint actions is especially challenging when each agent has only imperfect information about the state of its environment. Such problems can be modeled as collaborative Bayesian games in which each agent receives private information in the form of its type. However, representing and solving such games requires space and computation time exponential in the number of agents. This article introduces collaborative graphical Bayesian games (CGBGs), which facilitate more efficient collaborative decision making by decomposing the global payoff function as the sum of local payoff functions that depend on only a few agents. We propose a framework for the efficient solution of CGBGs based on the insight that they posses two different types of independence, which we call agent independence and type independence. In particular, we present a factor graph representation that captures both forms of independence and thus enables efficient solutions. In addition, we show how this representation can provide leverage in sequential tasks by using it to construct a novel method for decentralized partially observable Markov decision processes. Experimental results in both random and benchmark tasks demonstrate the improved scalability of our methods compared to several existing alternatives.


Timed Soft Concurrent Constraint Programs: An Interleaved and a Parallel Approach

arXiv.org Artificial Intelligence

We propose a timed and soft extension of Concurrent Constraint Programming. The time extension is based on the hypothesis of bounded asynchrony: the computation takes a bounded period of time and is measured by a discrete global clock. Action prefixing is then considered as the syntactic marker which distinguishes a time instant from the next one. Supported by soft constraints instead of crisp ones, tell and ask agents are now equipped with a preference (or consistency) threshold which is used to determine their success or suspension. In the paper we provide a language to describe the agents behavior, together with its operational and denotational semantics, for which we also prove the compositionality and correctness properties. After presenting a semantics using maximal parallelism of actions, we also describe a version for their interleaving on a single processor (with maximal parallelism for time elapsing). Coordinating agents that need to take decisions both on preference values and time events may benefit from this language. To appear in Theory and Practice of Logic Programming (TPLP).


Partially Observed, Multi-objective Markov Games

arXiv.org Artificial Intelligence

The intent of this research is to generate a set of non-dominated policies from which one of two agents (the leader) can select a most preferred policy to control a dynamic system that is also affected by the control decisions of the other agent (the follower). The problem is described by an infinite horizon, partially observed Markov game (POMG). At each decision epoch, each agent knows: its past and present states, its past actions, and noise corrupted observations of the other agent's past and present states. The actions of each agent are determined at each decision epoch based on these data. The leader considers multiple objectives in selecting its policy. The follower considers a single objective in selecting its policy with complete knowledge of and in response to the policy selected by the leader. This leader-follower assumption allows the POMG to be transformed into a specially structured, partially observed Markov decision process (POMDP). This POMDP is used to determine the follower's best response policy. A multi-objective genetic algorithm (MOGA) is used to create the next generation of leader policies based on the fitness measures of each leader policy in the current generation. Computing a fitness measure for a leader policy requires a value determination calculation, given the leader policy and the follower's best response policy. The policies from which the leader can select a most preferred policy are the non-dominated policies of the final generation of leader policies created by the MOGA. An example is presented that illustrates how these results can be used to support a manager of a liquid egg production process (the leader) in selecting a sequence of actions to best control this process over time, given that there is an attacker (the follower) who seeks to contaminate the liquid egg production process with a chemical or biological toxin.



Judgment Aggregation: A Primer

Morgan & Claypool Publishers

Judgment aggregation is a mathematical theory of collective decision-making. It concerns the methods whereby individual opinions about logically interconnected issues of interest can, or cannot, be aggregated into one collective stance. Aggregation problems have traditionally been of interest for disciplines like economics and the political sciences, as well as philosophy, where judgment aggregation itself originates from, but have recently captured the attention of disciplines like computer science, artificial intelligence and multi-agent systems. Judgment aggregation has emerged in the last decade as a unifying paradigm for the formalization and understanding of aggregation problems. Still, no comprehensive presentation of the theory is available to date.


Mechanisms for Fair Allocation Problems: No-Punishment Payment Rules in Verifiable Settings

Journal of Artificial Intelligence Research

Mechanism design is considered in the context of fair allocations of indivisible goods with monetary compensation, by focusing on problems where agents' declarations on allocated goods can be verified before payments are performed. A setting is considered where verification might be subject to errors, so that payments have to be awarded under the presumption of innocence, as incorrect declared values do not necessarily mean manipulation attempts by the agents. Within this setting, a mechanism is designed that is shown to be truthful, efficient, and budget-balanced. Moreover, agents' utilities are fairly determined by the Shapley value of suitable coalitional games, and enjoy highly desirable properties such as equal treatment of equals, envy-freeness, and a stronger one called individual-optimality. In particular, the latter property guarantees that, for every agent, her/his utility is the maximum possible one over any alternative optimal allocation. The computational complexity of the proposed mechanism is also studied. It turns out that it is #P-complete so that, to deal with applications with many agents involved, two polynomial-time randomized variants are also proposed: one that is still truthful and efficient, and which is approximately budget-balanced with high probability, and another one that is truthful in expectation, while still budget-balanced and efficient.


A General Framework for Interacting Bayes-Optimally with Self-Interested Agents using Arbitrary Parametric Model and Model Prior

arXiv.org Machine Learning

Recent advances in Bayesian reinforcement learning (BRL) have shown that Bayes-optimality is theoretically achievable by modeling the environment's latent dynamics using Flat-Dirichlet-Multinomial (FDM) prior. In self-interested multi-agent environments, the transition dynamics are mainly controlled by the other agent's stochastic behavior for which FDM's independence and modeling assumptions do not hold. As a result, FDM does not allow the other agent's behavior to be generalized across different states nor specified using prior domain knowledge. To overcome these practical limitations of FDM, we propose a generalization of BRL to integrate the general class of parametric models and model priors, thus allowing practitioners' domain knowledge to be exploited to produce a fine-grained and compact representation of the other agent's behavior. Empirical evaluation shows that our approach outperforms existing multi-agent reinforcement learning algorithms.


Active Learning for Autonomous Intelligent Agents: Exploration, Curiosity, and Interaction

arXiv.org Artificial Intelligence

In this survey we present different approaches that allow an intelligent agent to explore autonomous its environment to gather information and learn multiple tasks. Different communities proposed different solutions, that are in many cases, similar and/or complementary. These solutions include active learning, exploration/exploitation, online-learning and social learning. The common aspect of all these approaches is that it is the agent to selects and decides what information to gather next. Applications for these approaches already include tutoring systems, autonomous grasping learning, navigation and mapping and human-robot interaction. We discuss how these approaches are related, explaining their similarities and their differences in terms of problem assumptions and metrics of success. We consider that such an integrated discussion will improve inter-disciplinary research and applications.


Multi-period Trading Prediction Markets with Connections to Machine Learning

arXiv.org Machine Learning

We present a new model for prediction markets, in which we use risk measures to model agents and introduce a market maker to describe the trading process. This specific choice on modelling tools brings us mathematical convenience. The analysis shows that the whole market effectively approaches a global objective, despite that the market is designed such that each agent only cares about its own goal. Additionally, the market dynamics provides a sensible algorithm for optimising the global objective. An intimate connection between machine learning and our markets is thus established, such that we could 1) analyse a market by applying machine learning methods to the global objective, and 2) solve machine learning problems by setting up and running certain markets.


Multiagent Only Knowing in Dynamic Systems

Journal of Artificial Intelligence Research

The idea of "only knowing" a collection of sentences, as proposed by Levesque, has been previously shown to be very useful in characterizing knowledge-based agents: in terms of a specification, a precise and perspicuous account of the beliefs and non-beliefs is obtained in a monotonic setting. Levesque's logic is based on a first-order modal language with quantifying-in, thus allowing for de re versus de dicto distinctions, among other things. However, the logic and its recent dynamic extension only deal with the case of a single agent. In this work, we propose a first-order multiagent framework with knowledge, actions, sensing and only knowing, that is shown to inherit all the features of the single agent version. Most significantly, we prove reduction theorems by means of which reasoning about knowledge and actions in the framework simplifies to non-epistemic, non-dynamic reasoning about the initial situation.