Goto

Collaborating Authors

 Learning Graphical Models


Bayesian Query Construction for Neural Network Models

Neural Information Processing Systems

If data collection is costly, there is much to be gained by actively selecting particularly informative data points in a sequential way. In a Bayesian decision-theoretic framework we develop a query selection criterion which explicitly takes into account the intended use of the model predictions. By Markov Chain Monte Carlo methods the necessary quantities can be approximated to a desired precision. As the number of data points grows, the model complexity is modified by a Bayesian model selection strategy. The properties of two versions of the criterion ate demonstrated in numerical experiments.


Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems

Neural Information Processing Systems

Increasing attention has been paid to reinforcement learning algorithms in recent years, partly due to successes in the theoretical analysis of their behavior in Markov environments. If the Markov assumption is removed, however, neither generally the algorithms nor the analyses continue to be usable. We propose and analyze a new learning algorithm to solve a certain class of non-Markov decision problems. Our algorithm applies to problems in which the environment is Markov, but the learner has restricted access to state information. The algorithm involves a Monte-Carlo policy evaluation combined with a policy improvement method that is similar to that of Markov decision problems and is guaranteed to converge to a local maximum. The algorithm operates in the space of stochastic policies, a space which can yield a policy that performs considerably better than any deterministic policy. Although the space of stochastic policies is continuous-even for a discrete action space-our algorithm is computationally tractable.


Deterministic Annealing Variant of the EM Algorithm

Neural Information Processing Systems

We present a deterministic annealing variant of the EM algorithm maximum likelihood parameter estimation problems. In ourfor approach, the EM process is reformulated as the problem of minimizing the thermodynamic free energy by using the principle of maximum entropy and statistical mechanics analogy. Unlike simulated deterministicallyannealing approaches, this minimization is performed. Moreover, the derived algorithm, unlike the conventional better estimates free of the initialEM algorithm, can obtain parameter values.


Visual Speech Recognition with Stochastic Networks

Neural Information Processing Systems

This paper presents ongoing work on a speaker independent visual speech recognition system. The work presented here builds on previous research efforts in this area and explores the potential use of simple hidden Markov models for limited vocabulary, speaker independent visual speech recognition. The task at hand is recognition of the first four English digits, a task with possible applications in car-phone dialing. The images were modeled as mixtures of independent Gaussian distributions, and the temporal dependencies were captured with standard left-to-right hidden Markov models. The results indicate that simple hidden Markov models may be used to successfully recognize relatively unprocessed image sequences.


Estimating Conditional Probability Densities for Periodic Variables

Neural Information Processing Systems

In this paper we introduce three novel techniques for tackling such problems, and investigate their performance using syntheticdata. We then apply these techniques to the problem of extracting the distribution of wind vector directions from radar scatterometer data gathered by a remote-sensing satellite.


Pairwise Neural Network Classifiers with Probabilistic Outputs

Neural Information Processing Systems

Multi-class classification problems can be efficiently solved by partitioning the original problem into sub-problems involving only two classes: for each pair of classes, a (potentially small) neural network is trained using only the data of these two classes. We show how to combine the outputs of the two-class neural networks in order to obtain posterior probabilities for the class decisions. The resulting probabilistic pairwise classifier is part of a handwriting recognition system which is currently applied to check reading. We present results on real world data bases and show that, from a practical point of view, these results compare favorably to other neural network approaches.


A Mixture Model System for Medical and Machine Diagnosis

Neural Information Processing Systems

Diagnosis of human disease or machine fault is a missing data problem since many variables are initially unknown. Additional information needs to be obtained. The j oint probability distribution of the data can be used to solve this problem. We model this with mixture models whose parameters are estimated by the EM algorithm. This gives the benefit that missing data in the database itself can also be handled correctly. The request for new information to refine the diagnosis is performed using the maximum utility principle. Since the system is based on learning it is domain independent and less labor intensive than expert systems or probabilistic networks. An example using a heart disease database is presented.


Diffusion of Credit in Markovian Models

Neural Information Processing Systems

This paper studies the problem of diffusion in Markovian models, such as hidden Markov models (HMMs) and how it makes very difficult the task of learning of long-term dependencies in sequences. Using results from Markov chain theory, we show that the problem of diffusion is reduced if the transition probabilities approach 0 or 1. Under this condition, standard HMMs have very limited modeling capabilities, but input/output HMMs can still perform interesting computations.


The Use of Dynamic Writing Information in a Connectionist On-Line Cursive Handwriting Recognition System

Neural Information Processing Systems

This system combines a robust input representation, which preserves the dynamic writing information, with a neural network architecture, a so called Multi-State Time Delay Neural Network (MS-TDNN), which integrates rec.ognition and segmentation ina single framework. Our preprocessing transforms the original coordinate sequence into a (still temporal) sequence offeature vectors,which combine strictly local features, like curvature or writing direction, with a bitmap-like representation of the coordinate's proximity.The MS-TDNN architecture is well suited for handling temporal sequences as provided by this input representation. Oursystem is tested both on writer dependent and writer independent tasks with vocabulary sizes ranging from 400 up to 20,000 words. For example, on a 20,000 word vocabulary we achieve word recognition rates up to 88.9% (writer dependent) and 84.1 % (writer independent) without using any language models.


Inferring Ground Truth from Subjective Labelling of Venus Images

Neural Information Processing Systems

In practical situations, experts may visually examine the images and provide a subjective noisy estimate of the truth. Calibrating the reliability and bias of expert labellers is a nontrivial problem. In this paper we discuss some of our recent work on this topic in the context of detecting small volcanoes in Magellan SAR images of Venus. Empirical results (using the Expectation-Maximization procedure) suggest that accounting for subjective noise can be quite significant interms of quantifying both human and algorithm detection performance.