Undirected Networks
Training Algorithms for Hidden Markov Models using Entropy Based Distance Functions
Singer, Yoram, Warmuth, Manfred K.
By adapting a framework used for supervised learning, we construct iterative algorithms that maximize the likelihood of the observations while also attempting to stay "close" to the current estimated parameters. We use a bound on the relative entropy between the two HMMs as a distance measure betweenthem. The result is new iterative training algorithms which are similar to the EM (Baum-Welch) algorithm for training HMMs. The proposed algorithms are composed of a step similar to the expectation step of Baum-Welch and a new update of the parameters which replaces the maximization (re-estimation) step. The algorithm takes only negligibly moretime per iteration and an approximated version uses the same expectation step as Baum-Welch.
Machine-Learning Research
Machine-learning research has been making great progress in many directions. This article summarizes four of these directions and discusses some current open problems. The four directions are (1) the improvement of classification accuracy by learning ensembles of classifiers, (2) methods for scaling up supervised learning algorithms, (3) reinforcement learning, and (4) the learning of complex stochastic models.
Statistical Techniques for Natural Language Parsing
I review current statistical work on syntactic parsing and then consider part-of-speech tagging, which was the first syntactic problem to successfully be attacked by statistical techniques and also serves as a good warm-up for the main topic-statistical parsing. Here, I consider both the simplified case in which the input string is viewed as a string of parts of speech and the more interesting case in which the parser is guided by statistical information about the particular words in the sentence. Finally, I anticipate future research directions.
An Overview of Empirical Natural Language Processing
Brill, Eric, Mooney, Raymond J.
In recent years, there has been a resurgence in research on empirical methods in natural language processing. These methods employ learning techniques to automatically extract linguistic knowledge from natural language corpora rather than require the system developer to manually encode the requisite knowledge. The current special issue reviews recent research in empirical methods in speech recognition, syntactic parsing, semantic processing, information extraction, and machine translation. This article presents an introduction to the series of specialized articles on these topics and attempts to describe and explain the growing interest in using learning methods to aid the development of natural language processing systems.
A Model Approximation Scheme for Planning in Partially Observable Stochastic Domains
Partially observable Markov decision processes (POMDPs) are a natural model for planning problems where effects of actions are nondeterministic and the state of the world is not completely observable. It is difficult to solve POMDPs exactly. This paper proposes a new approximation scheme. The basic idea is to transform a POMDP into another one where additional information is provided by an oracle. The oracle informs the planning agent that the current state of the world is in a certain region. The transformed POMDP is consequently said to be region observable. It is easier to solve than the original POMDP. We propose to solve the transformed POMDP and use its optimal policy to construct an approximate policy for the original POMDP. By controlling the amount of additional information that the oracle provides, it is possible to find a proper tradeoff between computational time and approximation quality. In terms of algorithmic contributions, we study in details how to exploit region observability in solving the transformed POMDP. To facilitate the study, we also propose a new exact algorithm for general POMDPs. The algorithm is conceptually simple and yet is significantly more efficient than all previous exact algorithms.
Making an Impact: Artificial Intelligence at the Jet Propulsion Laboratory
Chien, Steve, DeCoste, Dennis, Doyle, Richard, Stolorz, Paul
The National Aeronautics and Space Administration (NASA) is being challenged to perform more frequent and intensive space-exploration missions at greatly reduced cost. Nowhere is this challenge more acute than among robotic planetary exploration missions that the Jet Propulsion Laboratory (JPL) conducts for NASA. This article describes recent and ongoing work on spacecraft autonomy and ground systems that builds on a legacy of existing success at JPL applying AI techniques to challenging computational problems in planning and scheduling, real-time monitoring and control, scientific data analysis, and design automation.
The 1996 AAAI Mobile Robot Competition and Exhibition
Kortenkamp, David, Nourbakhsh, Illah, Hinkle, David
The Fifth Annual AAAI Mobile Robot Competition and Exhibition was held in Portland, Oregon, in conjunction with the Thirteenth National Conference on Artificial Intelligence. The competition consisted of two events: (1) Office Navigation and (2) Clean Up the Tennis Court. The first event stressed navigation and planning. The second event stressed vision sensing and manipulation. In addition to the competition, there was a mobile robot exhibition in which teams demonstrated robot behaviors that did not fit into the competition tasks. The competition and exhibition were unqualified successes, with nearly 20 teams competing. The robot competition raised the standard for autonomous mobile robotics, demonstrating the intelligent integration of perception, deliberation, and action.
Stable Fitted Reinforcement Learning
We describe the reinforcement learning problem, motivate algorithms which seek an approximation to the Q function, and present new convergence results for two such algorithms. 1 INTRODUCTION AND BACKGROUND Imagine an agent acting in some environment. At time t, the environment is in some state Xt chosen from a finite set of states. The agent perceives Xt, and is allowed to choose an action at from some finite set of actions. Meanwhile, the agent experiences a real-valued cost Ct, chosen from a distribution which also depends only on Xt and at and which has finite mean and variance. Such an environment is called a Markov decision process, or MDP.
Learning Fine Motion by Markov Mixtures of Experts
Meila, Marina, Jordan, Michael I.
Brain and Cognitive Sciences Massachussetts Inst. of Technology Massachussetts Inst. of Technology Cambridge, MA 02139 Cambridge, MA 02139 mmp@ai.mit.edu Abstract Compliant control is a standard method for performing fine manipulation tasks, like grasping and assembly, but it requires estimation of the state of contact (s.o.c.) between the robot arm and the objects involved. Here we present a method to learn a model of the movement from measured data. The method requires little or no prior knowledge and the resulting model explicitly estimates the s.o.c. The current s.o.c. is viewed as the hidden state variable of a discrete HMM.
Active Gesture Recognition using Learned Visual Attention
Darrell, Trevor, Pentland, Alex
We have developed a foveated gesture recognition system that runs in an unconstrained office environment with an active camera. Using vision routines previously implemented for an interactive environment, we determine the spatial location of salient body parts of a user and guide an active camera to obtain images of gestures or expressions. A hidden-state reinforcement learning paradigm is used to implement visual attention. The attention module selects targets to foveate based on the goal of successful recognition, and uses a new multiple-model Q-Iearning formulation. Given a set of target and distractor gestures, our system can learn where to foveate to maximally discriminate a particular gesture. 1 INTRODUCTION Vision has numerous uses in the natural world.