Education
Autonomous Agents as Synthetic Characters
Elliott, Clark, Brzezinski, Jacek
Humans are social creatures. Much of our intelligence derives from our ability to manipulate our environment through collaborative endeavors. Most extant computer programs and interfaces do little to take advantage of such manifestly human talents and interests, leaving broad avenues of human-computer communication unexplored. Although it is still considered controversial, there are many who believe the harnessing of social communication to be rich in possibilities for modern software. In this article, we look at a number of autonomous agent systems that embody their intelligence at least partially through the projection of a believable, engaging, synthetic persona. Among other topics, we touch briefly on samples of research that explore synthetic personality, representations of emotion, societies of fanciful and playful characters, intelligent and engaging automated tutors, and users projected as avatars into virtual worlds.
AAAI News
However, all eligible students are encouraged to apply. After the conference, an expense report will be required [...] scholarships@aaai.org or at 445 Burgess Drive, Menlo Park, CA 94025 [...] All student scholarship recipients will be required to participate in the Student Volunteer Program to support [...]
[...] Intelligence (AAAI-98) will be available in late March by writing to ncai@aaai.org. Please note that the deadline for early registrations is May 27, 1998. The conference will be held July [...] Monona Terrace Convention Center, designed by Frank Lloyd Wright, in Madison, Wisconsin.
Third Annual Genetic Programming Conference (GP-98), July 22-25. Eleventh Annual Conference on Computational Learning Theory (COLT '98), July 24-26 (theory.lcs.mit. [...] Fifteenth International Conference on Machine Learning (ICML '98), July [...]
What Are Intelligence? And Why? 1996 AAAI Presidential Address
This article, derived from the 1996 Association for the Advancement of Artificial Intelligence Presidential Address, explores the notion of intelligence from a variety of perspectives and finds that it "are" many things. It has, for example, been interpreted in a variety of ways even within our own field, ranging from the logical view (intelligence as part of mathematical logic) to the psychological view (intelligence as an empirical phenomenon of the natural world) to a variety of others. One goal of this article is to go back to basics, reviewing the things that we, individually and collectively, have taken as given, in part because we have taken multiple different and sometimes inconsistent things for granted. I believe it will prove useful to expose the tacit assumptions, models, and metaphors that we carry around as a way of understanding both what we're about and why we sometimes seem to be at odds with one another. Intelligence are also many things in the sense that it is a product of evolution. Our physical bodies are in many ways overdetermined, unnecessarily complex, and inefficiently designed, that is, the predictable product of the blind search that is evolution. What's manifestly true of our anatomy is also likely true of our cognitive architecture. Natural intelligence is unlikely to be limited by principles of parsimony and is likely to be overdetermined, unnecessarily complex, and inefficiently designed. In this sense, intelligence are many things because it is composed of the many elements that have been thrown together over evolutionary timescales. I suggest that in the face of that, searching for minimalism and elegance may be a diversion, for it simply may not be there. Somewhat more crudely put: The human mind is a 400,000-year-old legacy application -- and you expected to find structured programming? I end with a number of speculations, suggesting that there are some niches in the design space of intelligences that are currently underexplored.
One example is the view that thinking is in part visual, and hence it might prove useful to develop representations and reasoning mechanisms that reason with diagrams (not just about them) and that take seriously their visual nature. I speculate as well that thinking may be a form of reliving, that re-acting out what we have experienced is one powerful way to think about and solve problems in the world. In this view, thinking is not simply the decontextualized manipulation of abstract symbols, powerful though that may be. Instead, some significant part of our thinking may be the reuse or simulation of our experiences in the environment. In keeping with this, I suggest that it may prove useful to marry the concreteness of reasoning in a model with the power that arises from reasoning abstractly.
Multidimensional Triangulation and Interpolation for Reinforcement Learning
Department of Computer Science, Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA 15213
Dynamic Programming, Q-learning and other discrete Markov Decision Process solvers can be applied to continuous d-dimensional state-spaces by quantizing the state space into an array of boxes. This is often problematic above two dimensions: a coarse quantization can lead to poor policies, and fine quantization is too expensive. Possible solutions are variable-resolution discretization, or function approximation by neural nets. A third option, which has been little studied in the reinforcement learning literature, is interpolation on a coarse grid. In this paper we study interpolation techniques that can result in vast improvements in the online behavior of the resulting control systems: multilinear interpolation, and an interpolation algorithm based on an interesting regular triangulation of d-dimensional space.
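To make the idea of interpolating on a coarse grid concrete, here is a minimal sketch of multilinear interpolation of a value function stored at grid vertices. It is an illustration under stated assumptions, not the paper's implementation: the function name `multilinear_interpolate`, the dictionary storage keyed by integer index tuples, and the convention that the query point is given in grid-index coordinates are all hypothetical choices made for this example.

```python
from itertools import product
from math import floor


def multilinear_interpolate(grid_values, point):
    """Interpolate a value at `point` from values stored at grid vertices.

    grid_values: dict mapping integer index tuples to floats
                 (hypothetical storage chosen for this sketch).
    point: sequence of floats in grid-index coordinates, i.e. each
           coordinate lies between 0 and (grid size - 1) on its axis.
    """
    lower = []  # index of the lower corner of the enclosing box
    frac = []   # fractional position inside the box along each axis
    for x in point:
        i = int(floor(x))
        lower.append(i)
        frac.append(x - i)

    total = 0.0
    # Visit all 2^d corners of the enclosing box.
    for corner in product((0, 1), repeat=len(point)):
        weight = 1.0
        for bit, f in zip(corner, frac):
            weight *= f if bit else (1.0 - f)
        if weight == 0.0:
            # No contribution; skipping also avoids looking up
            # vertices beyond the upper edge of the grid.
            continue
        idx = tuple(l + b for l, b in zip(lower, corner))
        total += weight * grid_values[idx]
    return total


# Usage: a 2 x 2 grid holding the values of f(i, j) = 2*i + 3*j.
values = {(i, j): 2.0 * i + 3.0 * j for i in range(2) for j in range(2)}
estimate = multilinear_interpolate(values, (0.5, 0.25))
```

Each query touches up to 2^d vertices, so the cost per lookup grows exponentially with the dimension d.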
Learning with Noise and Regularizers in Multilayer Neural Networks
We study the effect of noise and regularization in an online gradient-descent learning scenario for a general two-layer student network with an arbitrary number of hidden units. Training examples are randomly drawn input vectors labeled by a two-layer teacher network with an arbitrary number of hidden units; the examples are corrupted by Gaussian noise affecting either the output or the model itself. We examine the effect of both types of noise and that of weight-decay regularization on the dynamical evolution of the order parameters and the generalization error in various phases of the learning process.
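The learning scenario described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' setup: the tanh hidden units, unit hidden-to-output weights, input scaling, and all parameter values are choices made for this example, and only output noise (not noise on the model itself) is simulated. The names `network_output`, `sgd_step`, `generalization_error`, and `train` are hypothetical.

```python
import math
import random


def network_output(weights, x):
    """Two-layer net with unit output weights: sum over hidden units of tanh(w . x)."""
    return sum(math.tanh(sum(wi * xi for wi, xi in zip(w, x))) for w in weights)


def sgd_step(student, x, target, lr, weight_decay):
    """One online gradient step on 0.5*(output - target)^2 + (weight_decay/2)*|w|^2."""
    error = network_output(student, x) - target
    updated = []
    for w in student:
        pre = sum(wi * xi for wi, xi in zip(w, x))
        grad_scale = error * (1.0 - math.tanh(pre) ** 2)  # derivative of tanh
        updated.append([wi - lr * (grad_scale * xi + weight_decay * wi)
                        for wi, xi in zip(w, x)])
    return updated


def generalization_error(student, teacher, rng, n_inputs, samples=200):
    """Monte Carlo estimate of the error on fresh, noise-free examples."""
    scale = 1.0 / math.sqrt(n_inputs)
    total = 0.0
    for _ in range(samples):
        x = [rng.gauss(0.0, 1.0) * scale for _ in range(n_inputs)]
        diff = network_output(student, x) - network_output(teacher, x)
        total += 0.5 * diff * diff
    return total / samples


def train(n_inputs=5, n_hidden=2, steps=500, lr=0.05,
          noise_std=0.1, weight_decay=1e-3, seed=0):
    """Train a student on noisy teacher labels; return the estimated error."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_inputs)
    teacher = [[rng.gauss(0.0, 1.0) for _ in range(n_inputs)] for _ in range(n_hidden)]
    student = [[rng.gauss(0.0, 0.1) for _ in range(n_inputs)] for _ in range(n_hidden)]
    for _ in range(steps):
        x = [rng.gauss(0.0, 1.0) * scale for _ in range(n_inputs)]
        # Teacher label corrupted by additive Gaussian output noise.
        target = network_output(teacher, x) + rng.gauss(0.0, noise_std)
        student = sgd_step(student, x, target, lr, weight_decay)
    return generalization_error(student, teacher, rng, n_inputs)
```

Calling `train()` with different `noise_std` and `weight_decay` values gives a rough, simulation-level feel for how the two interact; the paper itself analyzes the dynamics through order parameters rather than by simulation of this kind.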
Learning from Demonstration
By now it is widely accepted that learning a task from scratch, i.e., without any prior knowledge, is a daunting undertaking. Humans, however, rarely attempt to learn from scratch. They extract initial biases, as well as strategies for approaching a learning problem, from the instructions and/or demonstrations of other humans. For learning control, this paper investigates how learning from demonstration can be applied in the context of reinforcement learning. We consider priming the Q-function, the value function, the policy, and the model of the task dynamics as possible areas where demonstrations can speed up learning. In general nonlinear learning problems, only model-based reinforcement learning shows significant speedup after a demonstration, while in the special case of linear quadratic regulator (LQR) problems, all methods profit from the demonstration. In an implementation of pole balancing on a complex anthropomorphic robot arm, we demonstrate that, when facing the complexities of real signal processing, model-based reinforcement learning offers the most robustness for LQR problems. Using the suggested methods, the robot learns pole balancing in just a single trial after a 30-second demonstration by the human instructor.
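One plausible reading of "priming the Q-function" is to apply ordinary Q-learning updates to the recorded demonstration before any autonomous trials. The sketch below illustrates that reading for a small tabular problem; it is an assumption made for illustration, not the paper's method, which addresses continuous control. The function name `prime_q_from_demonstration`, the transition tuple layout, and the parameter values are hypothetical.

```python
from collections import defaultdict


def prime_q_from_demonstration(demo, actions, alpha=0.5, gamma=0.9, sweeps=50):
    """Initialize a tabular Q-function by replaying demonstrated transitions.

    demo: list of (state, action, reward, next_state, done) tuples
          recorded from the demonstrator (hypothetical format).
    actions: list of all actions available to the learner.
    """
    q = defaultdict(float)  # unseen (state, action) pairs default to 0.0
    for _ in range(sweeps):
        for state, action, reward, next_state, done in demo:
            # Standard Q-learning target; no bootstrapping past terminal states.
            best_next = 0.0 if done else max(q[(next_state, b)] for b in actions)
            target = reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
    return q


# Usage: a two-step demonstration that walks right to a rewarding goal.
demo = [
    (0, "right", 0.0, 1, False),
    (1, "right", 1.0, 2, True),
]
q = prime_q_from_demonstration(demo, actions=["left", "right"])
greedy_at_start = max(["left", "right"], key=lambda a: q[(0, a)])
```

After priming, the learner would continue with its usual online updates, starting from `q` rather than from an all-zero table.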
Local Bandit Approximation for Optimal Learning Problems
Duff, Michael O., Barto, Andrew G.
A Bayesian formulation of the problem leads to a clear concept of a solution whose computation, however, appears to entail an examination of an intractably large number of hyperstates. This paper has suggested extending the Gittins index approach (which applies with great power and elegance to the special class of multi-armed bandit processes) to general adaptive MDPs. The hope has been that if certain salient features of the value of information could be captured, even approximately, then one could be led to a reasonable method for avoiding certain defects of certainty-equivalence approaches (problems with identifiability, "metastability"). Obviously, positive evidence, in the form of empirical results from simulation experiments, would lend support to these ideas; work along these lines is underway. Local bandit approximation is but one approximate computational approach for problems of optimal learning and dual control. Most prominent in the literature of control theory is the "wide-sense" approach of [Bar-Shalom & Tse, 1976], which utilizes local quadratic approximations about nominal state/control trajectories. For certain problems, this method has demonstrated superior performance compared to a certainty-equivalence approach, but it is computationally very intensive and unwieldy, particularly for problems with controller dimension greater than one. One could revert to the view of the bandit problem, or general adaptive MDP, as simply a very large MDP defined over hyperstates, and then consider a somewhat direct approach in which one performs approximate dynamic programming with function approximation over this domain; details of function approximation, feature selection, and "training" all become important design issues.
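The view of a bandit problem as a large MDP over hyperstates can be made concrete with exact dynamic programming on a tiny instance. The sketch below is not the local bandit approximation itself; it assumes Bernoulli arms with Beta posteriors and a finite, undiscounted horizon, all chosen for illustration. The function name `bayes_optimal_value` and the hyperstate encoding are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def bayes_optimal_value(hyperstate, steps_left):
    """Expected total reward of the Bayes-optimal policy.

    hyperstate: tuple of (a, b) Beta posterior parameters, one pair per arm,
                so (1, 1) encodes a uniform prior with no observations.
    steps_left: number of pulls remaining (finite horizon, no discounting).
    """
    if steps_left == 0:
        return 0.0
    best = 0.0
    for arm, (a, b) in enumerate(hyperstate):
        p_success = a / (a + b)  # posterior mean of this arm's payoff rate
        # Pulling an arm changes only that arm's posterior counts.
        after_win = hyperstate[:arm] + ((a + 1, b),) + hyperstate[arm + 1:]
        after_loss = hyperstate[:arm] + ((a, b + 1),) + hyperstate[arm + 1:]
        value = (p_success * (1.0 + bayes_optimal_value(after_win, steps_left - 1))
                 + (1.0 - p_success) * bayes_optimal_value(after_loss, steps_left - 1))
        best = max(best, value)
    return best


# Usage: two arms with uniform priors and a horizon of ten pulls.
value = bayes_optimal_value(((1, 1), (1, 1)), 10)
hyperstates_visited = bayes_optimal_value.cache_info().currsize
```

The memo table grows quickly with the horizon and the number of arms, which is the intractability the passage refers to and the reason approximate methods are of interest.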