Goto

Collaborating Authors

 Agents


Utilizing Partial Policies for Identifying Equivalence of Behavioral Models

AAAI Conferences

We present a novel approach for identifying exact and approximate behavioral equivalence between models of agents. This is significant because both decision making and game play in multiagent settings must contend with behavioral models of other agents in order to predict their actions. One approach that reduces the complexity of the model space is to group models that are behaviorally equivalent. Identifying equivalence between models requires solving them and comparing entire policy trees. Because the trees grow exponentially with the horizon, our approach is to focus on partial policy trees for comparison and determining the distance between updated beliefs at the leaves of the trees. We propose a principled way to determine how much of the policy trees to consider, which trades off solution quality for efficiency. We investigate this approach in the context of the interactive dynamic influence diagram and evaluate its performance.


When to Stop? That Is the Question

AAAI Conferences

When to make a decision is a key question in decision making problems characterized by uncertainty. In this paper we deal with decision making in environments where the information arrives dynamically. We address the tradeoff between waiting and stopping strategies. On the one hand, waiting to obtain more information reduces the uncertainty, but it comes with a cost. On the other hand, stopping and making a decision based on an expected utility, decreases the cost of waiting, but the decision is made based on uncertain information. In this paper, we prove that computing the optimal time to make a decision that guarantees the optimal utility is NP-hard. We propose a pessimistic approximation that guarantees an optimal decision when the recommendation is to wait. We empirically evaluate our algorithm and show that the quality of the decision is near-optimal and much faster than the optimal algorithm.


Comparing Agents' Success against People in Security Domains

AAAI Conferences

The interaction of people with autonomous agents has become increasingly prevalent. Some of these settings include security domains, where people can be characterized as uncooperative, hostile, manipulative, and tending to take advantage of the situation for their own needs. This makes it challenging to design proficient agents to interact with people in such environments. Evaluating the success of the agents automatically before evaluating them with people or deploying them could alleviate this challenge and result in better designed agents. In this paper we show how Peer Designed Agents (PDAs) -- computer agents developed by human subjects -- can be used as a method for evaluating autonomous agents in security domains. Such evaluation can reduce the effort and costs involved in evaluating autonomous agents interacting with people to validate their efficacy. Our experiments included more than 70 human subjects and 40 PDAs developed by students. The study provides empirical support that PDAs can be used to compare the proficiency of autonomous agents when matched with people in security domains.


The Influence of Emotion Expression on Perceptions of Trustworthiness in Negotiation

AAAI Conferences

When interacting with computer agents, people make inferences about various characteristics of these agents, such as their reliability and trustworthiness. These perceptions are significant, as they influence people's behavior towards the agents, and may foster or inhibit repeated interactions between them. In this paper we investigate whether computer agents can use the expression of emotion to influence human perceptions of trustworthiness. In particular, we study human-computer interactions within the context of a negotiation game, in which players make alternating offers to decide on how to divide a set of resources. A series of negotiation games between a human and several agents is then followed by a "trust game." In this game people have to choose one among several agents to interact with, as well as how much of their resources they will trust to it. Our results indicate that, among those agents that displayed emotion, those whose expression was in accord with their actions (strategy) during the negotiation game were generally preferred as partners in the trust game over those whose emotion expressions and actions did not mesh. Moreover, we observed that when emotion does not carry useful new information, it fails to strongly influence human decision-making behavior in a negotiation setting.


Coordinated Multi-Agent Reinforcement Learning in Networked Distributed POMDPs

AAAI Conferences

In many multi-agent applications such as distributed sensor nets, a network of agents act collaboratively under uncertainty and local interactions. Networked Distributed POMDP (ND-POMDP) provides a framework to model such cooperative multi-agent decision making. Existing work on ND-POMDPs has focused on offline techniques that require accurate models, which are usually costly to obtain in practice. This paper presents a model-free, scalable learning approach that synthesizes multi-agent reinforcement learning (MARL) and distributed constraint optimization (DCOP). By exploiting structured interaction in ND-POMDPs, our approach distributes the learning of the joint policy and employs DCOP techniques to coordinate distributed learning to ensure the global learning performance. Our approach can learn a globally optimal policy for ND-POMDPs with a property called groupwise observability. Experimental results show that, with communication during learning and execution, our approach significantly outperforms the nearly-optimal non-communication policies computed offline.


Incentive-Compatible Escrow Mechanisms

AAAI Conferences

The most prominent way to establish trust between buyers and sellers on online auction sites are reputation mechanisms. Two drawbacks of this approach are the reliance on the seller being long-lived and the susceptibility to whitewashing. In this paper, we introduce so-called escrow mechanisms that avoid these problems by installing a trusted intermediary which forwards the payment to the seller only if the buyer acknowledges that the good arrived in the promised condition. We address the incentive issues that arise and design an escrow mechanism that is incentive-compatible, efficient, interim individually rational and ex ante budget-balanced. In contrast to previous work on trust and reputation, our approach does not rely on knowing the sellers' cost functions or the distribution of buyer valuations.


Dominant-Strategy Auction Design for Agents with Uncertain, Private Values

AAAI Conferences

We consider the problem of designing auctions for settings in Theorem 1 (Dominant strategy impossibility (Larson which bidders have to pay a cost to learn about their preferences, and Sandholm 2004a)). There does not exist any mechanism and hence can face tradeoffs between the cost and accuracy that is strategic deliberation-proof, strategy-dependent, of their preference information. Such bidders are called non-misleading, and preference-formation independent in deliberative agents, and have featured in a wide variety of dominant-strategy equilibrium across all possible quasilinear auction models. For example, costly deliberation can model deliberative-agent settings.


Constrained Coalition Formation

AAAI Conferences

The conventional model of coalition formation considers every possible subset of agents as a potential coalition. However, in many real-world applications, there are inherent constraints on feasible coalitions: for instance, certain agents may be prohibited from being in the same coalition, or the coalition structure may be required to consist of coalitions of the same size. In this paper, we present the first systematic study of constrained coalition formation (CCF). We propose a general framework for this problem, and identify an important class of CCF settings, where the constraints specify which groups of agents should/should not work together. We describe a procedure that transforms such constraints into a structured input that allows coalition formation algorithms to identify, without any redundant computations, all the feasible coalitions. We then use this procedure to develop an algorithm for generating an optimal (welfare-maximizing) constrained coalition structure, and show that it outperforms existing state-of-the-art approaches by several orders of magnitude.


A Distributed Anytime Algorithm for Dynamic Task Allocation in Multi-Agent Systems

AAAI Conferences

Our approach Multi-agent task allocation is an important and challenging yields significant reductions in both run-time and communication, problem, which involves deciding how to assign a set thereby increasing real-world applicability. of agents to a set of tasks, both of which may change over In more detail, in this paper we advance the state-ofthe-art time (i.e., it is a dynamic environment). Moreover, it is often in the following ways: first, we present a novel, necessary for heterogeneous agents to form teams (known as online domain pruning algorithm specifically tailored to coalitions) to complete certain tasks in the environment. In dynamic task allocation environments to reduce the number coalitions, agents can often complete tasks more efficiently of potential solutions that need to be considered.


Automated Action Abstraction of Imperfect Information Extensive-Form Games

AAAI Conferences

Multi-agent decision problems can often be formulated as extensive-form games. We focus on imperfect information extensive-form games in which one or more actions at many decision points have an associated continuous or many-valued parameter. A stock trading agent, in addition to deciding whether to buy or not, must decide how much to buy. In no-limit poker, in addition to selecting a probability for each action, the agent must decide how much to bet for each betting action. Selecting values for these parameters makes these games extremely large. Two-player no-limit Texas Hold'em poker with stacks of 500 big blinds has approximately 10 71 states, which is more than 10 50 times more states than two-player limit Texas Hold'em. The main contribution of this paper is a technique that abstracts a game's action space by selecting one, or a small number, of the many values for each parameter. We show that strategies computed using this new algorithm for no-limit Leduc poker exhibit significant utility gains over epsilon-Nash equilibrium strategies computed with standard, hand-crafted parameter value abstractions.