Wang

AAAI Conferences

There are many successful methods for transferring information from one agent to another. One approach, taken in this work, is to have one (source) agent demonstrate a policy to a second (target) agent, and then have that second agent improve upon the policy. By allowing the target agent to observe the source agent's demonstrations, rather than relying on other forms of direct knowledge transfer such as Q-values, rules, or shared representations, we remove the need for the agents to know anything about each other's internal representation or to share a common language. In this work, we introduce a refinement to HAT (Human-Agent Transfer), an existing transfer learning method, by integrating the target agent's confidence in its representation of the source agent's policy. Results show that a target agent can effectively (1) improve its initial performance relative to learning without transfer (jumpstart) and (2) improve its performance relative to the source agent (total reward). Furthermore, with this refinement both the jumpstart and total reward improve relative to learning without transfer and relative to learning with the original HAT.
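As a concrete illustration of the idea, the following minimal Python sketch gates policy reuse on the confidence of a model learned from the source agent's demonstrations. All class and parameter names are ours, not the paper's; the sketch assumes a hypothetical classifier interface predict(state) -> (action, confidence), and uses probabilistic reuse with a decaying phi, which is one of several ways a learned demonstration policy can be consulted.

import random

class ConfidenceBasedReuse:
    """Confidence-gated policy reuse (illustrative; names are hypothetical).

    source_model is any classifier trained on the source agent's
    demonstrated (state, action) pairs that exposes
    predict(state) -> (action, confidence).
    """

    def __init__(self, source_model, phi=1.0, decay=0.999, conf_threshold=0.8):
        self.source_model = source_model
        self.phi = phi                        # probability of consulting the source policy
        self.decay = decay                    # phi decays as the target learns on its own
        self.conf_threshold = conf_threshold  # minimum confidence to follow a demonstration

    def select_action(self, state, target_action):
        """Wraps the target learner's usual (e.g. epsilon-greedy) choice."""
        action = target_action
        if random.random() < self.phi:
            demo_action, confidence = self.source_model.predict(state)
            # Refinement: follow the demonstrated action only in states where
            # the learned model of the source policy is confident.
            if confidence >= self.conf_threshold:
                action = demo_action
        self.phi *= self.decay
        return action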


Young

AAAI Conferences

We investigate the problem of learning to control small groups of units in combat situations in Real-Time Strategy (RTS) games. AI systems may acquire such skills by observing and learning from expert players, or from other AI systems performing those tasks. However, access to training data may be limited, and representations based on metric information (position, velocity, orientation, etc.) may be brittle, difficult for learning mechanisms to work with, and generalise poorly to new situations. In this work we apply qualitative spatial relations to compress such continuous, metric state-spaces into symbolic states, and show that this makes the learning problem easier and allows for more general models of behaviour. Models learnt from this representation are used to control situated agents and to imitate the observed behaviour of both synthetic (pre-programmed) agents and human-controlled agents on a number of canonical micro-management tasks. We show how a Monte-Carlo method can be used to decompress qualitative data back into quantitative data for practical use in our control system. We present our work applied to the popular RTS game StarCraft.
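To make the compression and decompression steps concrete, here is a small Python sketch under assumed conventions: three qualitative distance rings plus eight compass sectors. The particular relations, bin edges, and function names are our invention, not the paper's. Decompression uses simple Monte-Carlo rejection sampling to draw a metric offset consistent with a given symbolic state.

import math
import random

# Hypothetical QSR scheme: three qualitative distance rings and eight
# compass sectors, relative to a reference unit.
DISTANCE_BINS = [(0, 2, "adjacent"), (2, 6, "near"), (6, 15, "far")]
DIRECTIONS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]

def to_qualitative(dx, dy):
    """Compress a metric offset into a (distance, direction) symbol pair."""
    r = math.hypot(dx, dy)
    dist_sym = next((name for lo, hi, name in DISTANCE_BINS if lo <= r < hi),
                    "out_of_range")
    angle = math.degrees(math.atan2(dx, dy)) % 360  # clockwise from north
    dir_sym = DIRECTIONS[int((angle + 22.5) // 45) % 8]
    return dist_sym, dir_sym

def sample_metric(dist_sym, dir_sym, tries=1000):
    """Monte-Carlo decompression: draw one metric offset consistent with a
    qualitative state, by rejection sampling."""
    lo, hi = next((l, h) for l, h, name in DISTANCE_BINS if name == dist_sym)
    for _ in range(tries):
        r = random.uniform(lo, hi)
        theta = random.uniform(0, 2 * math.pi)
        dx, dy = r * math.sin(theta), r * math.cos(theta)
        if to_qualitative(dx, dy) == (dist_sym, dir_sym):
            return dx, dy
    raise RuntimeError("no consistent sample found")

Each accepted sample is one plausible metric realisation of the symbolic state, which is what a controller needs when it must issue concrete move commands from a qualitative model.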


Information Modeling for a Dynamic Representation of an Emergency Situation

arXiv.org Artificial Intelligence

In this paper we propose an approach to building a decision support system that can help emergency planners and responders detect and manage emergency situations. The internal mechanism of the system is independent of the application treated, so we believe the system can be reused or easily adapted to different case studies. We focus here on a first step in the decision-support process: modeling the information drawn from the perceived environment and representing it dynamically using a multiagent system. This modeling was applied to the RoboCupRescue Simulation System; an implementation and some results are presented.
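The abstract does not specify a concrete data structure, but a minimal sketch of "representing perceived information dynamically" might look like the following, with every name hypothetical: each fact records where it came from and when, and an agent keeps only the freshest report per (subject, attribute) pair.

from dataclasses import dataclass, field
import time

@dataclass
class PerceivedFact:
    """One piece of information issued from the perceived environment."""
    subject: str        # e.g. "building_12" or "civilian_3"
    attribute: str      # e.g. "on_fire" or "position"
    value: object
    source: str         # which agent or sensor reported it
    timestamp: float = field(default_factory=time.time)

class RepresentationAgent:
    """Keeps the most recent value of each (subject, attribute) pair so the
    shared picture of the situation stays current as new reports arrive."""

    def __init__(self):
        self.beliefs = {}

    def perceive(self, fact):
        key = (fact.subject, fact.attribute)
        known = self.beliefs.get(key)
        if known is None or fact.timestamp >= known.timestamp:
            self.beliefs[key] = fact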


Learning Policy Representations in Multiagent Systems

arXiv.org Artificial Intelligence

Modeling agent behavior is central to understanding the emergence of complex phenomena in multiagent systems. Prior work in agent modeling has largely been task-specific and driven by hand-engineered, domain-specific prior knowledge. We propose a general learning framework for modeling agent behavior in any multiagent system using only a handful of interaction episodes. Our framework casts agent modeling as a representation learning problem. Accordingly, we construct a novel objective inspired by imitation learning and agent identification, and design an algorithm for unsupervised learning of representations of agent policies. We empirically demonstrate the utility of the proposed framework in (i) a challenging high-dimensional competitive environment for continuous control and (ii) a cooperative environment for communication, on supervised prediction tasks, unsupervised clustering, and policy optimization using deep reinforcement learning.
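A minimal PyTorch sketch of this kind of objective follows: an encoder maps an episode to a policy embedding, a conditioned policy imitates the agent's actions given that embedding, and a triplet term pulls embeddings of the same agent together. The module names, network sizes, and the use of mean-squared error for continuous actions are our assumptions, not the authors' exact architecture.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PolicyEncoder(nn.Module):
    """Maps an episode of (state, action) pairs to a policy embedding."""
    def __init__(self, obs_dim, act_dim, embed_dim=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(),
                                 nn.Linear(64, embed_dim))

    def forward(self, states, actions):  # (T, obs_dim), (T, act_dim)
        return self.net(torch.cat([states, actions], dim=-1)).mean(dim=0)

class ConditionedPolicy(nn.Module):
    """Predicts an agent's actions from states and its policy embedding."""
    def __init__(self, obs_dim, act_dim, embed_dim=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim + embed_dim, 64), nn.ReLU(),
                                 nn.Linear(64, act_dim))

    def forward(self, states, embedding):
        z = embedding.expand(states.shape[0], -1)
        return self.net(torch.cat([states, z], dim=-1))

def representation_loss(encoder, policy, ep_a, ep_b, ep_neg, margin=1.0):
    """ep_a and ep_b are (states, actions) episodes of the same agent,
    ep_neg an episode of a different agent."""
    z_a, z_b, z_n = encoder(*ep_a), encoder(*ep_b), encoder(*ep_neg)
    # Imitation: predict ep_a's actions from the embedding of a *different*
    # episode of the same agent, so the embedding must capture the policy.
    imitation = F.mse_loss(policy(ep_a[0], z_b), ep_a[1])
    # Agent identification: same-agent embeddings close, different-agent apart.
    identification = F.triplet_margin_loss(z_a.unsqueeze(0), z_b.unsqueeze(0),
                                           z_n.unsqueeze(0), margin=margin)
    return imitation + identification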


Konidaris

AAAI Conferences

We consider the problem of constructing a symbolic description of a continuous, low-level environment for use in planning. We show that symbols that can represent the preconditions and effects of an agent's actions are both necessary and sufficient for high-level planning. This eliminates the symbol design problem when a representation must be constructed in advance, and in principle enables an agent to autonomously learn its own symbolic representations. The resulting representation can be converted into PDDL (the Planning Domain Definition Language), a canonical high-level planning representation that enables very fast planning.
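To illustrate the final conversion step, here is a toy Python sketch: if each learned symbol names a set of low-level states (grounded as a test over the continuous state), and each operator records which symbols must hold before an action and which become true or false after it, then emitting a PDDL action schema is mechanical. The classes and the door example are purely illustrative, not the paper's construction.

from dataclasses import dataclass

@dataclass
class Symbol:
    """Names a set of low-level states via a test over the continuous state."""
    name: str
    test: callable  # state -> bool; the symbol's grounding

@dataclass
class Operator:
    name: str
    preconditions: list  # symbols that must hold for the action to succeed
    add_effects: list    # symbols the action makes true
    del_effects: list    # symbols the action makes false

    def to_pddl(self):
        pre = " ".join(f"({s.name})" for s in self.preconditions)
        add = " ".join(f"({s.name})" for s in self.add_effects)
        delete = " ".join(f"(not ({s.name}))" for s in self.del_effects)
        return (f"(:action {self.name}\n"
                f"  :precondition (and {pre})\n"
                f"  :effect (and {add} {delete}))")

# Toy usage: a door-opening action over a continuous state vector s.
door_closed = Symbol("door-closed", lambda s: s[0] < 0.1)
door_open = Symbol("door-open", lambda s: s[0] > 0.9)
print(Operator("open-door", [door_closed], [door_open], [door_closed]).to_pddl())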