
Collaborating Authors

 Lesser, Victor


DJAO: A Communication-Constrained DCOP Algorithm that Combines Features of ADOPT and Action-GDL

AAAI Conferences

In this paper we propose a novel DCOP algorithm, called DJAO, that is able to efficiently find a solution with low communication overhead; the algorithm can produce optimal or bounded approximate solutions by appropriately setting the error bounds. Our approach builds on the distributed junction trees used in Action-GDL to represent independence relations among variables. We construct an AND/OR search space based on these junction trees. This new type of search space results in higher degrees for each OR node, consequently yielding a more efficient search graph in distributed settings. DJAO uses a branch-and-bound search algorithm to distributedly find solutions within this search graph. We introduce heuristics to compute the upper and lower bound estimates that the search starts with, which is integral to our approach for reducing communication overhead. We empirically evaluate our approach in various settings.
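The bound-driven pruning at the heart of DJAO can be illustrated with a small, centralized sketch: depth-first branch and bound over variable assignments, pruning any branch whose accumulated cost already reaches the best known upper bound. The toy variables, domains, and cost tables below are invented for illustration; DJAO itself performs this kind of bounded search distributedly, over an AND/OR graph derived from a junction tree.

DOMAIN = [0, 1]                       # every variable is binary in this toy problem
VARIABLES = ["x1", "x2", "x3"]

# Binary cost tables: COSTS[(vi, vj)][(a, b)] is the cost of vi=a, vj=b.
COSTS = {
    ("x1", "x2"): {(0, 0): 1, (0, 1): 3, (1, 0): 0, (1, 1): 2},
    ("x2", "x3"): {(0, 0): 2, (0, 1): 0, (1, 0): 5, (1, 1): 1},
}

def partial_cost(assignment):
    """Cost of all constraints whose endpoints are both assigned."""
    total = 0
    for (vi, vj), table in COSTS.items():
        if vi in assignment and vj in assignment:
            total += table[(assignment[vi], assignment[vj])]
    return total

def branch_and_bound():
    best_cost, best_assignment = float("inf"), None

    def search(i, assignment):
        nonlocal best_cost, best_assignment
        cost_so_far = partial_cost(assignment)
        if cost_so_far >= best_cost:          # prune: lower bound already meets the upper bound
            return
        if i == len(VARIABLES):
            best_cost, best_assignment = cost_so_far, dict(assignment)
            return
        for value in DOMAIN:
            assignment[VARIABLES[i]] = value
            search(i + 1, assignment)
            del assignment[VARIABLES[i]]

    search(0, {})
    return best_assignment, best_cost

print(branch_and_bound())                     # ({'x1': 1, 'x2': 0, 'x3': 1}, 0)

Starting the search with tighter initial upper and lower bound estimates, as DJAO's heuristics do, prunes more branches, which is what reduces message traffic in the distributed setting.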


Compact Mathematical Programs For DEC-MDPs With Structured Agent Interactions

arXiv.org Artificial Intelligence

To deal with the prohibitive complexity of calculating policies in Decentralized MDPs, researchers have proposed models that exploit structured agent interactions. Settings where most agent actions are independent except for few actions that affect the transitions and/or rewards of other agents can be modeled using Event-Driven Interactions with Complex Rewards (EDI-CR). Finding the optimal joint policy can be formulated as an optimization problem. However, existing formulations are too verbose and/or lack optimality guarantees. We propose a compact Mixed Integer Linear Program formulation of EDI-CR instances. The key insight is that most action sequences of a group of agents have the same effect on a given agent. This allows us to treat these sequences similarly and use fewer variables. Experiments show that our formulation is more compact and leads to faster solution times and better solutions than existing formulations.
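The key insight, that many action sequences of the other agents induce the same effect on a given agent, can be sketched in a few lines of Python. The actions, horizon, and the effect_on_me signature below are invented for illustration; the paper's MILP would introduce variables per equivalence class rather than per sequence.

from itertools import product
from collections import defaultdict

ACTIONS = ["a", "b", "c"]
HORIZON = 2

def effect_on_me(sequence):
    # Hypothetical effect signature: all that matters to our agent is
    # whether the interacting action "c" was ever executed.
    return "c" in sequence

groups = defaultdict(list)
for seq in product(ACTIONS, repeat=HORIZON):
    groups[effect_on_me(seq)].append(seq)

print(3 ** HORIZON, "action sequences")   # 9 sequences
print(len(groups), "distinct effects")    # 2 equivalence classes
for effect, members in groups.items():
    print(effect, members)

With three actions over two steps there are nine sequences but only two distinct effects on this agent, so a formulation indexed by effects carries far fewer variables.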


Coordinated Multi-Agent Reinforcement Learning in Networked Distributed POMDPs

AAAI Conferences

In many multi-agent applications such as distributed sensor nets, a network of agents acts collaboratively under uncertainty and with only local interactions. The Networked Distributed POMDP (ND-POMDP) provides a framework for modeling such cooperative multi-agent decision making. Existing work on ND-POMDPs has focused on offline techniques that require accurate models, which are usually costly to obtain in practice. This paper presents a model-free, scalable learning approach that synthesizes multi-agent reinforcement learning (MARL) and distributed constraint optimization (DCOP). By exploiting the structured interaction in ND-POMDPs, our approach distributes the learning of the joint policy and employs DCOP techniques to coordinate the distributed learners and ensure global learning performance. Our approach can learn a globally optimal policy for ND-POMDPs with a property called groupwise observability. Experimental results show that, with communication during both learning and execution, our approach significantly outperforms the nearly optimal non-communication policies computed offline.
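A minimal, non-distributed sketch of the underlying idea is given below: learn local Q-components that mirror the factored reward structure, and coordinate action selection by maximizing their sum. The toy reward decomposition and the exhaustive maximization are assumptions of this sketch, standing in for the paper's ND-POMDP model and its DCOP-based coordination.

import random
from itertools import product

ACTIONS = [0, 1]

# A toy factored reward: two local terms and one pairwise interaction
# term, mirroring the structured interaction that ND-POMDPs exploit.
def sample_rewards(a1, a2):
    r1 = (1.0 if a1 == 1 else 0.0) + random.gauss(0, 0.1)
    r2 = (1.0 if a2 == 0 else 0.0) + random.gauss(0, 0.1)
    r12 = (3.0 if (a1, a2) == (1, 1) else 0.0) + random.gauss(0, 0.1)
    return r1, r2, r12

alpha = 0.1
Q1 = {a: 0.0 for a in ACTIONS}                       # agent 1's local component
Q2 = {a: 0.0 for a in ACTIONS}                       # agent 2's local component
Q12 = {a: 0.0 for a in product(ACTIONS, ACTIONS)}    # shared interaction component

def coordinate(epsilon=0.1):
    """Pick the joint action maximizing the sum of the Q-components.
    A DCOP algorithm would do this by message passing; here we enumerate."""
    if random.random() < epsilon:
        return random.choice(ACTIONS), random.choice(ACTIONS)
    return max(product(ACTIONS, ACTIONS),
               key=lambda a: Q1[a[0]] + Q2[a[1]] + Q12[a])

for _ in range(2000):
    a1, a2 = coordinate()
    r1, r2, r12 = sample_rewards(a1, a2)
    Q1[a1] += alpha * (r1 - Q1[a1])
    Q2[a2] += alpha * (r2 - Q2[a2])
    Q12[(a1, a2)] += alpha * (r12 - Q12[(a1, a2)])

print(coordinate(epsilon=0.0))   # the coordinated joint action (1, 1)

With many agents, the enumeration inside coordinate() is exactly the step a DCOP algorithm would replace with distributed message passing over the interaction graph.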


Multi-Agent Learning with Policy Prediction

AAAI Conferences

Due to the non-stationary environment, learning in multi-agent systems is a challenging problem. This paper first introduces a new gradient-based learning algorithm, augmenting the basic gradient ascent approach with policy prediction. We prove that this augmentation results in a stronger notion of convergence than the basic gradient ascent, that is, strategies converge to a Nash equilibrium within a restricted class of iterated games. Motivated by this augmentation, we then propose a new practical multi-agent reinforcement learning (MARL) algorithm exploiting approximate policy prediction. Empirical results show that it converges faster and in a wider variety of situations than state-of-the-art MARL algorithms.
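A minimal sketch of gradient ascent with policy prediction on matching pennies is shown below: each player ascends its payoff gradient evaluated at a one-step prediction of the opponent's strategy. The step size, prediction length, and starting strategies are illustrative choices, not values from the paper.

# Matching pennies: V1(p, q) = 4pq - 2p - 2q + 1 and V2 = -V1, where p and q
# are the probabilities that player 1 and player 2 play "heads".
def grad1(p, q):
    return 4 * q - 2          # dV1/dp

def grad2(p, q):
    return 2 - 4 * p          # dV2/dq

def clip(x):
    return min(1.0, max(0.0, x))

eta, gamma = 0.01, 0.05       # learning rate and prediction length (illustrative)
p, q = 0.9, 0.2               # arbitrary starting strategies

for _ in range(5000):
    # each player predicts the other's next strategy and ascends the
    # gradient evaluated at that prediction
    q_pred = clip(q + gamma * grad2(p, q))
    p_pred = clip(p + gamma * grad1(p, q))
    p, q = clip(p + eta * grad1(p, q_pred)), clip(q + eta * grad2(p_pred, q))

print(round(p, 3), round(q, 3))   # approaches the mixed equilibrium (0.5, 0.5)

Plain gradient ascent cycles around the equilibrium of this game, whereas the predicted-policy gradient spirals in toward (0.5, 0.5), which is the stronger convergence notion the paper refers to.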


Towards Multiagent Meta-level Control

AAAI Conferences

Embedded systems consisting of collaborating agents capable of interacting with their environment are becoming ubiquitous. It is crucial for these systems to be able to adapt to the dynamic and uncertain characteristics of an open environment. In this paper, we argue that multiagent meta-level control (MMLC) is an effective way to determine when this adaptation process should be done and how much effort should be invested in adaptation as opposed to continuing with the current action plan. We describe a reinforcement-learning-based approach to learning decentralized meta-control policies offline. We then propose to use the learned reward model as input to a global optimization algorithm to avoid conflicting meta-level decisions between coordinating agents. Our initial experiments in the context of NetRads, a multiagent tornado-tracking application, show that MMLC significantly improves performance in a 3-agent network.
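As a toy illustration of the meta-level decision itself, the sketch below learns a single agent's choice between continuing with the current plan and spending effort on adaptation, using tabular Q-learning. The meta-states, rewards, and drift dynamics are invented, and the sketch ignores the decentralized coordination between agents that the paper addresses.

import random

# Toy meta-level control problem: the meta-state is how much the
# environment has drifted since the current plan was formed, and the
# meta-actions are to keep executing it or to spend effort adapting it.
STATES = ["low_drift", "high_drift"]
ACTIONS = ["continue", "adapt"]

def step(state, action):
    if action == "adapt":
        return "low_drift", -1.0        # pay the adaptation effort, reset drift
    if state == "low_drift":
        # the current plan still works, but the environment may drift
        return ("high_drift" if random.random() < 0.3 else "low_drift"), 1.0
    return "high_drift", -3.0           # executing a stale plan is costly

Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
alpha, discount, epsilon = 0.1, 0.9, 0.1
state = "low_drift"
for _ in range(20000):
    action = (random.choice(ACTIONS) if random.random() < epsilon
              else max(ACTIONS, key=lambda a: Q[(state, a)]))
    next_state, reward = step(state, action)
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (reward + discount * best_next - Q[(state, action)])
    state = next_state

for s in STATES:
    print(s, "->", max(ACTIONS, key=lambda a: Q[(s, a)]))
# learns to continue while drift is low and to adapt once it is high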


An Ensemble Learning and Problem Solving Architecture for Airspace Management

AAAI Conferences

In this paper we describe the application of a novel learning and problem solving architecture to the domain of airspace management, where multiple requests for the use of airspace need to be reconciled and managed automatically. The key feature of our "Generalized Integrated Learning Architecture" (GILA) is a set of integrated learning and reasoning (ILR) systems coordinated by a central meta-reasoning executive (MRE). Each ILR learns independently from the same training example and contributes to problem-solving in concert with other ILRs as directed by the MRE. Formal evaluations show that our system performs as well as or better than humans after learning from the same training data. Further, GILA outperforms any individual ILR run in isolation, thus demonstrating the power of the ensemble architecture for learning and problem solving.
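The propose-and-select pattern behind the ensemble can be caricatured in a few lines: several independent solvers each produce a solution to the same toy deconfliction instance, and a meta-reasoning executive keeps the proposal that scores best under a shared evaluation function. The toy requests, the three solvers, and the scoring rule are invented for illustration and omit the learning that GILA's ILRs actually perform.

requests = [(0, 4), (3, 5), (4, 7), (6, 9), (1, 8)]   # toy (start, end) airspace requests

def overlaps(a, b):
    return a[0] < b[1] and b[0] < a[1]

def score(accepted):
    # the MRE's shared evaluation: accept many requests, allow no overlaps
    conflicts = sum(1 for i in range(len(accepted))
                      for j in range(i + 1, len(accepted))
                      if overlaps(accepted[i], accepted[j]))
    return len(accepted) - 10 * conflicts

def ilr_earliest_finish(reqs):
    accepted = []
    for r in sorted(reqs, key=lambda r: r[1]):
        if not any(overlaps(r, a) for a in accepted):
            accepted.append(r)
    return accepted

def ilr_shortest_first(reqs):
    accepted = []
    for r in sorted(reqs, key=lambda r: r[1] - r[0]):
        if not any(overlaps(r, a) for a in accepted):
            accepted.append(r)
    return accepted

def ilr_accept_all(reqs):
    return list(reqs)

proposals = [(ilr.__name__, ilr(requests))
             for ilr in (ilr_earliest_finish, ilr_shortest_first, ilr_accept_all)]
best_name, best_schedule = max(proposals, key=lambda p: score(p[1]))
print(best_name, best_schedule)   # e.g. ilr_earliest_finish [(0, 4), (4, 7)]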


Leveled-Commitment Contracting: A Backtracking Instrument for Multiagent Systems

AI Magazine

In (automated) negotiation systems for self-interested agents, contracts have traditionally been binding. They do not accommodate future events. Contingency contracts address this but are often impractical. As an alternative, we propose leveled-commitment contracts. The level of commitment is set by decommitting penalties. To be freed from the contract, an agent simply pays its penalty to the other contract party(ies). A self-interested agent will be reluctant to decommit because some other contract party might decommit, in which case the former agent gets freed from the contract, does not incur a penalty, and collects a penalty from the other party. We show that despite such strategic decommitting, leveled commitment increases the expected payoffs of all contract parties and can enable deals that are impossible under full commitment. Different decommitting mechanisms are introduced and compared. Practical prescriptions for market designers are presented. A contract optimizer, ECOMMITTER, is provided on the web.
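A small numeric illustration (not the paper's model, and unrelated to ECOMMITTER) shows how a decommitting penalty can raise both parties' expected payoffs: the contractor can profitably switch to a better outside offer, while the contractee is compensated by the penalty. All prices, probabilities, and values below are invented.

price = 10.0             # agreed contract price paid to the contractor
cost = 7.0               # contractor's cost of performing the task
value = 14.0             # contractee's value for having the task performed
penalty = 2.0            # decommitting penalty under leveled commitment
p_outside = 0.5          # probability the contractor later gets an outside offer
outside_profit = 8.0     # contractor's profit if it takes that offer
recontract_price = 11.0  # contractee's fallback price if it is decommitted on

# Full commitment: the contract is always performed.
contractor_full = price - cost    # 3.0
contractee_full = value - price   # 4.0

# Leveled commitment: the contractor decommits (paying the penalty) only
# when the outside offer nets more than performing the contract.
decommits = outside_profit - penalty > price - cost   # True with these numbers
contractor_lc = (p_outside * (outside_profit - penalty)
                 + (1 - p_outside) * (price - cost)) if decommits else contractor_full
contractee_lc = (p_outside * (value - recontract_price + penalty)
                 + (1 - p_outside) * (value - price)) if decommits else contractee_full

print(contractor_full, contractor_lc)   # 3.0 -> 4.5
print(contractee_full, contractee_lc)   # 4.0 -> 4.5

With these numbers both parties gain in expectation relative to full commitment, which is the effect the abstract describes.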