Controlling the Behavior of Animated Presentation Agents in the Interface: Scripting versus Instructing

AI Magazine

Lifelike characters, or animated agents, provide a promising option for interface development because they allow us to draw on communication and interaction styles with which humans are already familiar. In this contribution, we revisit some of our past and ongoing projects to motivate an evolution of character-based presentation systems. This evolution starts from systems in which a character presents information content in the style of a TV presenter. It continues with the introduction of presentation teams that convey information to the user by performing role plays. To explore new forms of active user involvement during a presentation, the next step leads to systems that convey information in the style of interactive performances. From a technical point of view, this evolution is mirrored in different approaches to determining the behavior of the employed characters. By means of concrete applications, we argue that a central planning component for automated agent scripting is not always a good choice, especially not in the case of interactive performances, where the user might take on an active role as well.


Interface Agents in Model World Environments

AI Magazine

Choosing an environment is an important decision for agent developers. A key issue in this decision is whether the environment will provide realistic problems for the agent to solve, in the sense that the problems are true to the issues that arise in addressing a particular research question. Beyond realism, other important considerations include how tractable the problems that can be formulated in the environment are, how easily agent performance can be measured, and whether the environment can be customized or extended for specific research questions. In the ideal environment, researchers can pose realistic but tractable problems to an agent, measure and evaluate its performance, and iteratively rework the environment to explore increasingly ambitious questions, all at a reasonable cost in time and effort. As might be expected, trade-offs dominate the suitability of an environment; however, we have found that the modern graphical user interface offers a good balance among these trade-offs. This article takes a brief tour of agent research in the user interface, showing how significant questions related to vision, planning, learning, cognition, and communication are currently being addressed.


Infinite-Horizon Policy-Gradient Estimation

Journal of Artificial Intelligence Research

Gradient-based approaches to direct policy search in reinforcement learning have received much recent attention as a means to solve problems of partial observability and to avoid some of the problems associated with policy degradation in value-function methods. In this paper we introduce GPOMDP, a simulation-based algorithm for generating a biased estimate of the gradient of the average reward in partially observable Markov decision processes (POMDPs) controlled by parameterized stochastic policies. A similar algorithm was proposed by Kimura et al. (1995). The algorithm's chief advantages are that it requires storage of only twice the number of policy parameters, uses one free parameter beta (which has a natural interpretation in terms of bias-variance trade-off), and requires no knowledge of the underlying state. We prove convergence of GPOMDP and show how the correct choice of the parameter beta is related to the mixing time of the controlled POMDP. We briefly describe extensions of GPOMDP to controlled Markov chains; continuous state, observation, and control spaces; multiple agents; higher-order derivatives; and a version for training stochastic policies with internal states. In a companion paper (Baxter et al., this volume) we show how the gradient estimates generated by GPOMDP can be used in both a traditional stochastic-gradient algorithm and a conjugate-gradient procedure to find local optima of the average reward.
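
To make the estimator concrete, here is a minimal sketch of a GPOMDP-style gradient estimate in Python. It assumes a generic simulation interface (reset() returning an observation, step() returning an observation and a reward) and a policy object exposing its parameter count, an action sampler, and the gradient of the log action probability; these names are illustrative assumptions, not the paper's notation.

import numpy as np

def gpomdp_estimate(env, policy, beta, num_steps):
    # Single-trajectory estimate of the gradient of the average reward with
    # respect to the policy parameters (hypothetical interface, illustration only).
    z = np.zeros(policy.num_params)      # eligibility trace
    delta = np.zeros(policy.num_params)  # running gradient estimate
    obs = env.reset()
    for t in range(num_steps):
        action = policy.sample(obs)
        # Discounted trace of grad-log action probabilities; beta in [0, 1)
        # is the single free parameter of the algorithm.
        z = beta * z + policy.grad_log_prob(obs, action)
        obs, reward = env.step(action)
        # Incremental average of reward-weighted traces.
        delta += (reward * z - delta) / (t + 1)
    return delta

In this sketch, larger values of beta reduce the bias of the estimate but increase its variance, which is the trade-off the paper relates to the mixing time of the controlled POMDP.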


Experiments with Infinite-Horizon, Policy-Gradient Estimation

Journal of Artificial Intelligence Research

In this paper, we present algorithms that perform gradient ascent of the average reward in a partially observable Markov decision process (POMDP). These algorithms are based on GPOMDP, an algorithm introduced in a companion paper (Baxter & Bartlett, this volume), which computes biased estimates of the performance gradient in POMDPs. The algorithm's chief advantages are that it uses only one free parameter, beta, which has a natural interpretation in terms of bias-variance trade-off; it requires no knowledge of the underlying state; and it can be applied to infinite state, control, and observation spaces. We show how the gradient estimates produced by GPOMDP can be used to perform gradient ascent, both with a traditional stochastic-gradient algorithm and with an algorithm based on conjugate gradients that uses gradient information to bracket maxima in line searches. Experimental results are presented illustrating both the theoretical results of Baxter and Bartlett (this volume) on a toy problem and practical aspects of the algorithms on a number of more realistic problems.
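
As a rough illustration of how such estimates drive an optimization loop, the following sketch performs plain stochastic gradient ascent with a fixed step size, assuming a gradient-estimator callable such as the GPOMDP-style sketch above; the fixed step size and the absence of a line search are simplifications for illustration, not the paper's experimental setup (which also includes a conjugate-gradient variant that brackets maxima in line searches).

def gradient_ascent(policy, estimate_gradient, step_size=0.01, num_iterations=100):
    # Repeatedly estimate the performance gradient and move the policy
    # parameters in that direction (illustrative interface only).
    for _ in range(num_iterations):
        grad = estimate_gradient(policy)   # e.g., a GPOMDP-style estimate
        policy.params = policy.params + step_size * grad
    return policy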


AltAlt: Combining Graphplan and Heuristic State Search

AI Magazine

We briefly describe the implementation and evaluation of a novel plan synthesis system called AltAlt. AltAlt is designed to exploit the complementary strengths of two currently popular, competing approaches to plan generation: (1) Graphplan and (2) heuristic state search. It uses the planning graph to derive effective heuristics that then guide heuristic state search. The heuristics derived from the planning graph do a better job of taking subgoal interactions into account and, as such, are significantly more effective than existing heuristics. AltAlt was implemented on top of two state-of-the-art planning systems: (1) STAN3.0, a Graphplan-style planner, and (2) HSP-R, a heuristic search planner.
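
As a rough sketch of guiding state search with a planning-graph-derived heuristic, the following Python fragment runs a best-first search whose heuristic sums the planning-graph levels at which unsatisfied goal propositions first appear; the helpers level_cost and successors, and the treatment of states as frozensets of propositions, are assumptions for illustration, not AltAlt's actual interfaces or heuristic family.

import heapq
import itertools

def heuristic(state, goals, level_cost):
    # Sum of planning-graph levels of the goals not yet achieved: a crude
    # illustration of a heuristic extracted from the planning graph.
    return sum(level_cost(g) for g in goals if g not in state)

def plan_search(initial_state, goals, successors, level_cost):
    # Best-first search over states (frozensets of propositions); a counter
    # breaks ties so states never have to be compared directly.
    counter = itertools.count()
    frontier = [(heuristic(initial_state, goals, level_cost), next(counter), 0, initial_state, [])]
    visited = set()
    while frontier:
        f_cost, _, g_cost, state, plan = heapq.heappop(frontier)
        if goals <= state:
            return plan
        if state in visited:
            continue
        visited.add(state)
        for action, next_state in successors(state):
            h = heuristic(next_state, goals, level_cost)
            heapq.heappush(frontier, (g_cost + 1 + h, next(counter), g_cost + 1, next_state, plan + [action]))
    return None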


Creativity at the Metalevel: AAAI-2000 Presidential Address

AI Magazine

Creativity is sometimes taken to be an inexplicable aspect of human activity. By summarizing a considerable body of literature on creativity, I hope to show how to turn some of the best ideas about creativity into programs that are demonstrably more creative than any we have seen to date. I believe the key to building more creative programs is to give them the ability to reflect on and modify their own frameworks and criteria. That is, I believe that the key to creativity is at the metalevel.


TALplanner: A Temporal Logic-Based Planner

AI Magazine

TALplanner is a forward-chaining planner that utilizes domain-dependent knowledge to control search in the state space generated by action invocation. The domain-dependent control knowledge, background knowledge, plans, and goals are all represented as formulas in a temporal logic called TAL, which has been developed independently as a formalism for specifying agent narratives and reasoning about them. In the planning competition held at the Fifth International Conference on Artificial Intelligence Planning and Scheduling, TALplanner exhibited impressive performance, winning the Outstanding Performance Award in the Domain-Dependent Planning Competition. In this article, we provide an overview of TALplanner.
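
The core idea can be pictured as a forward search in which successor states violating the control knowledge are pruned. The following Python sketch illustrates that loop, with control_ok() standing in for evaluation of the TAL control formulas and the other callables (goal_test, applicable_actions, apply_action) being assumed helpers rather than TALplanner's actual machinery.

from collections import deque

def forward_search(initial_state, goal_test, applicable_actions, apply_action, control_ok):
    # Forward-chaining search; domain-dependent control knowledge prunes
    # successor states that violate the control formulas, which is what
    # keeps the explored state space manageable.
    frontier = deque([(initial_state, [])])
    while frontier:
        state, plan = frontier.popleft()
        if goal_test(state):
            return plan
        for action in applicable_actions(state):
            next_state = apply_action(state, action)
            if control_ok(plan + [action], next_state):
                frontier.append((next_state, plan + [action]))
    return None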