 Agents


Bayesian Policy Search for Multi-Agent Role Discovery

AAAI Conferences

Bayesian inference is an appealing approach for leveraging prior knowledge in reinforcement learning (RL). In this paper we describe an algorithm for discovering different classes of roles for agents via Bayesian inference. In particular, we develop a Bayesian policy search approach for Multi-Agent RL (MARL), which is model-free and allows for priors on policy parameters. We present a novel optimization algorithm based on hybrid MCMC, which leverages both the prior and gradient information estimated from trajectories. Our experiments in a complex real-time strategy game demonstrate the effective discovery of roles from supervised trajectories, the use of discovered roles for successful transfer to similar tasks, and the discovery of roles through reinforcement learning.


Two-Player Game Structures for Generalized Planning and Agent Composition

AAAI Conferences

In this paper, we review a series of agent behavior synthesis problems under full observability and nondeterminism (partial controllability), ranging from conditional planning, to recently introduced agent planning programs, to sophisticated forms of agent behavior composition, and show that all of them can be solved by model checking two-player game structures. These structures are akin to the transition systems/Kripke structures usually adopted in model checking, except that they distinguish between the actions/moves of two antagonistic players (and hence allow separate quantification over them). We show that using them we can implement solvers for several agent behavior synthesis problems.


Coalition Structure Generation based on Distributed Constraint Optimization

AAAI Conferences

Forming effective coalitions is a major research challenge in AI and multi-agent systems (MAS). Coalition structure generation (CSG) involves partitioning a set of agents into coalitions so that social surplus (the sum of the rewards of all coalitions) is maximized. A partition is called a coalition structure (CS). In traditional works, the value of a coalition is given by a black box function called a characteristic function. In this paper, we propose a novel formalization of CSG, i.e., we assume the value of a characteristic function is given by an optimal solution of a distributed constraint optimization problem (DCOP) among the agents of a coalition. A DCOP is a popular approach for modeling cooperative agents, since it is quite general and can formalize various application problems in MAS. At first glance, one might assume that the computational costs required in this approach would be too expensive, since we need to solve an NP-hard problem just to obtain the value of a single coalition. To optimally solve a CSG, we might need to solve 2^n DCOP problem instances, where n is the number of agents. However, quite surprisingly, we show that an approximation algorithm, whose computational cost is about the same as solving just one DCOP, can find a CS with quality guarantees. More specifically, we develop an algorithm with parameter k that can find a CS whose social surplus is at least max(k/(w*+1), 2k/n) of the optimal CS, where w* is the tree width of a constraint graph. When k=1, the complexity of this algorithm is about the same as solving just one DCOP. These results illustrate that the locality of interactions among agents, which is explicitly modeled in the DCOP formalization, is quite useful in developing an efficient CSG algorithm with quality guarantees.
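To make the definitions above concrete, here is a minimal Python sketch of exhaustive coalition structure generation together with the stated quality bound. It is not the approximation algorithm from the paper: the function names (`partitions`, `best_coalition_structure`, `quality_bound`) and the toy characteristic function are hypothetical, and the brute-force search is only feasible for a handful of agents.

```python
from itertools import combinations


def partitions(agents):
    """Yield every partition of the list `agents` into non-empty coalitions."""
    if not agents:
        yield []
        return
    first, rest = agents[0], agents[1:]
    # Choose which of the remaining agents join the coalition containing `first`.
    for size in range(len(rest) + 1):
        for companions in combinations(rest, size):
            coalition = (first,) + companions
            remaining = [a for a in rest if a not in companions]
            for sub in partitions(remaining):
                yield [coalition] + sub


def best_coalition_structure(agents, value):
    """Brute force: return the partition with the largest social surplus.

    `value` is a black-box characteristic function (an assumption here;
    in the abstract it would be the optimum of a DCOP over the coalition).
    """
    return max(partitions(agents), key=lambda cs: sum(value(c) for c in cs))


def quality_bound(k, w_star, n):
    """Fraction of the optimal surplus guaranteed: max(k/(w*+1), 2k/n)."""
    return max(k / (w_star + 1), 2 * k / n)


if __name__ == "__main__":
    # Toy characteristic function (hypothetical): value grows with coalition size.
    toy_value = lambda coalition: len(coalition) ** 2
    print(best_coalition_structure([1, 2, 3], toy_value))
    print(quality_bound(k=1, w_star=2, n=10))
```

With n agents the exhaustive search evaluates every partition, which is why an approximation whose cost is close to solving a single DCOP is attractive.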


Finding Optimal Solutions to Cooperative Pathfinding Problems

AAAI Conferences

In cooperative pathfinding problems, non-interfering paths that bring each agent from its current state to its goal state must be planned for multiple agents. We present the first practical, admissible, and complete algorithm for solving problems of this kind. First, we propose a technique called operator decomposition, which can be used to reduce the branching factors of many search algorithms, including algorithms for cooperative pathfinding. We then show how a type of independence common in instances of cooperative pathfinding problems can be exploited. Next, we take the idea of exploiting independent subproblems further by adding improvements that allow the algorithm to recognize many more cases of such independence. Finally, we show empirically that these techniques drastically improve the performance of the standard admissible algorithm for the cooperative pathfinding problem, and that their combination results in a complete algorithm capable of optimally solving relatively large problems in milliseconds.


Parallel Depth First Proof Number Search

AAAI Conferences

Depth-first proof-number search (df-pn) is an effective and popular algorithm for solving and-or tree problems by using proof and disproof numbers. This paper presents a simple but effective parallelization of the df-pn search algorithm for a shared-memory system. In this parallelization, multiple agents autonomously conduct df-pn with a shared transposition table. For effective cooperation among agents, virtual proof and disproof numbers are introduced for each node; these estimate future proof and disproof numbers by treating the number of agents working on the node's descendants as a possible increase. Experimental results on large checkmate problems in shogi, a popular chess variant in Japan, show that reasonable increases in speed were achieved with small overheads in memory.


Integrating Reinforcement Learning into a Programming Language

AAAI Conferences

Creating artificial intelligent agents that are high-fidelity simulations of natural agents will require the engagement of behavioral scientists. However, agent programming systems that are accessible to behavioral scientists are too limited to create rich agents, and systems for creating rich agents are accessible mainly to computer scientists, not behavioral scientists. We are solving this problem by engaging behavioral scientists in the design of a programming language, and integrating reinforcement learning into the programming language. This strategy will help our language achieve adaptivity, modularity, and, most importantly, accessibility to behavioral scientists. In addition to allowing behavioral scientists to write rich agent programs, our language, AFABL (A Friendly Behavior Language), will enable a true discipline of modular agent software engineering with broad implications for games, interactive storytelling, and social simulations.


Multi-Agent Fault Tolerance Inspired by a Computational Analysis of Cancer

AAAI Conferences

My thesis investigates fault tolerance for cooperative agent systems that have some equivalent of self-replication and self-death. Utilizing biologically-inspired mechanisms, I increase multi-agent system robustness for faulty agents when it is unknown exactly which agent is malfunctioning. It is important to determine new ways to increase robustness of a system, as otherwise it cannot be guaranteed to function in all situations and thus cannot be relied upon. Robustness of a system allows agents to recover from errors and thus function continuously, an increasingly important trait as agent systems are deployed in real world scenarios such as sensor networks or surveillance systems where faulty or malicious nodes could disrupt application performance. To achieve robustness, there must either be prevention of all errors, or a technique for recovering from errors after they have occurred. My thesis creates a new fault tolerance mechanism inspired by cancer biology to remove faulty agents, and then re-applies the developed technique to study the removal of biological cancer cells in simulation.


Enhancing Affective Communication in Embodied Conversational Agents

AAAI Conferences

Embodied Conversational Agents (ECAs) are computer-generated characters whose purpose is to exhibit the same properties as humans in face-to-face conversation. The general goal of researchers in the field of ECAs is to create agents that can be more natural, believable and easy to use. Due to the broad scope of research and the multidisciplinary nature of the field, many other investigations can arise in many different areas, leading researchers to face numerous questions: What kind of embodiment to use? What parts of the body to represent? What kind of modalities to explore? What personality model to consider? Will the ECA have emotions? [...] motivation for the study of ECAs, inside the PRAIA project, started with the belief that ECAs represent a promising solution for responding appropriately to students in educational environments. This work, however, cannot be placed inside the "task and Application domains" concentration of the taxonomy presented above. We are not interested in designing and implementing an ECA to meet the needs and fill a suitable role within one specific educational environment. We believe that making a general contribution in other concentrations will increase the possibilities of future research inside [...]


Integrating Expert Knowledge and Experience

AAAI Conferences

My thesis work combines AI, programming language design, and software engineering. I am integrating reinforcement learning (RL) into a programming language so that the language achieves three primary goals: accessibility, adaptivity, and modularity. If I am successful, my work will enable a discipline of modular large-scale agent software engineering while making advanced agent modeling [...] This incompleteness of perception and dynamism in the environment creates a strong need for adaptivity. Programming this adaptivity by hand in a language that does not provide built-in support for adaptivity is very cumbersome. As I demonstrated [...] or designer specifies the structure of certain parts of a program while leaving other portions unspecified, such that a learning system can learn how to perform them.


Preferences and Learning in Multi-Agent Negotiation

AAAI Conferences

In online, dynamic environments, the service requested by consumers may not be readily served by the producers. This requires the consumers and producers to negotiate on the content of the service. To automate this process, agents play a key role in e-commerce. As far as the agents' negotiation strategies are concerned, understanding and reasoning on their users' preferences are important to generate the right offers on behalf of their users. Besides, taking other participants' needs into account is important to be able to negotiate effectively. However, preferences of participants are almost always private. The best that can happen is that participants may learn each other's preferences through interactions over time. As agents learn each other's preferences, they can provide better-targeted offers and thus enable faster negotiation. My research direction involves representing and reasoning on preferences, and learning preferences through interaction in automated negotiation.