Goto

Collaborating Authors

 Learning Graphical Models


Learning Exact Patterns of Quasi-synchronization among Spiking Neurons from Data on Multi-unit Recordings

Neural Information Processing Systems

This paper develops arguments for a family of temporal log-linear models to represent spatiotemporal correlations among the spiking events in a group of neurons. The models can represent not just pairwise correlations but also correlations of higher order. Methods are discussed for inferring the existence or absence of correlations and estimating their strength. A frequentist and a Bayesian approach to correlation detection are compared.


Approximate Solutions to Optimal Stopping Problems

Neural Information Processing Systems

We propose and analyze an algorithm that approximates solutions to the problem of optimal stopping in a discounted irreducible aperiodic Markov chain. The scheme involves the use of linear combinations of fixed basis functions to approximate a Q-function. The weights of the linear combination are incrementally updated through an iterative process similar to Q-Iearning, involving simulation of the underlying Markov chain. Due to space limitations, we only provide an overview of a proof of convergence (with probability 1) and bounds on the approximation error. This is the first theoretical result that establishes the soundness of a Q-Iearninglike algorithm when combined with arbitrary linear function approximators to solve a sequential decision problem.


Analysis of Temporal-Diffference Learning with Function Approximation

Neural Information Processing Systems

We present new results about the temporal-difference learning algorithm, as applied to approximating the cost-to-go function of a Markov chain using linear function approximators. The algorithm we analyze performs online updating of a parameter vector during a single endless trajectory of an aperiodic irreducible finite state Markov chain. Results include convergence (with probability 1), a characterization of the limit of convergence, and a bound on the resulting approximation error. In addition to establishing new and stronger results than those previously available, our analysis is based on a new line of reasoning that provides new intuition about the dynamics of temporal-difference learning. Furthermore, we discuss the implications of two counterexamples with regards to the Significance of online updating and linearly parameterized function approximators. 1 INTRODUCTION The problem of predicting the expected long-term future cost (or reward) of a stochastic dynamic system manifests itself in both time-series prediction and control.


Local Bandit Approximation for Optimal Learning Problems

Neural Information Processing Systems

A Bayesian formulation of the problem leads to a clear concept of a solution whose computation, however, appears to entail an examination of an intractably-large number of hyperstates. This paper has suggested extending the Gittins index approach (which applies with great power and elegance to the special class of multi-armed bandit processes) to general adaptive MDP's. The hope has been that if certain salient features of the value of information could be captured, even approximately, then one could be led to a reasonable method for avoiding certain defects of certainty-equivalence approaches (problems with identifiability, "metastability"). Obviously, positive evidence, in the form of empirical results from simulation experiments, would lend support to these ideas-work along these lines is underway. Local bandit approximation is but one approximate computational approach for problems of optimal learning and dual control. Most prominent in the literature of control theory is the "wide-sense" approach of [Bar-Shalom & Tse, 1976], which utilizes local quadratic approximations about nominal state/control trajectories. For certain problems, this method has demonstrated superior performance compared to a certainty-equivalence approach, but it is computationally very intensive and unwieldy, particularly for problems with controller dimension greater than one. One could revert to the view of the bandit problem, or general adaptive MDP, as simply a very large MDP defined over hyperstates, and then consider a some- Local Bandit Approximationfor Optimal Learning Problems 1025 what direct approach in which one performs approximate dynamic programming with function approximation over this domain-details of function-approximation, feature-selection, and "training" all become important design issues.


Contour Organisation with the EM Algorithm

Neural Information Processing Systems

This paper describes how the early visual process of contour organisation can be realised using the EM algorithm. The underlying computational representation is based on fine spline coverings. According to our EM approach the adjustment of spline parameters draws on an iterative weighted least-squares fitting process. The expectation step of our EM procedure computes the likelihood of the data using a mixture model defined over the set of spline coverings. These splines are limited in their spatial extent using Gaussian windowing functions.


Compositionality, MDL Priors, and Object Recognition

Neural Information Processing Systems

Images are ambiguous at each of many levels of a contextual hierarchy. Nevertheless, the high-level interpretation of most scenes is unambiguous, as evidenced by the superior performance of humans. This observation argues for global vision models, such as deformable templates. Unfortunately, such models are computationally intractable for unconstrained problems. We propose a compositional model in which primitives are recursively composed, subject to syntactic restrictions, to form tree-structured objects and object groupings. Ambiguity is propagated up the hierarchy in the form of multiple interpretations, which are later resolved by a Bayesian, equivalently minimum-description-Iength, cost functional.


A New Approach to Hybrid HMM/ANN Speech Recognition using Mutual Information Neural Networks

Neural Information Processing Systems

This paper presents a new approach to speech recognition with hybrid HMM/ANN technology. While the standard approach to hybrid HMMI ANN systems is based on the use of neural networks as posterior probability estimators, the new approach is based on the use of mutual information neural networks trained with a special learning algorithm in order to maximize the mutual information between the input classes of the network and its resulting sequence of firing output neurons during training. It is shown in this paper that such a neural network is an optimal neural vector quantizer for a discrete hidden Markov model system trained on Maximum Likelihood principles. One of the main advantages of this approach is the fact, that such neural networks can be easily combined with HMM's of any complexity with context-dependent capabilities. It is shown that the resulting hybrid system achieves very high recognition rates, which are now already on the same level as the best conventional HMM systems with continuous parameters, and the capabilities of the mutual information neural networks are not yet entirely exploited.


Dynamic Features for Visual Speechreading: A Systematic Comparison

Neural Information Processing Systems

Humans use visual as well as auditory speech signals to recognize spoken words. A variety of systems have been investigated for performing this task. The main purpose of this research was to systematically compare the performance of a range of dynamic visual features on a speechreading task. We have found that normalization of images to eliminate variation due to translation, scale, and planar rotation yielded substantial improvements in generalization performance regardless of the visual representation used. In addition, the dynamic information in the difference between successive frames yielded better performance than optical-flow based approaches, and compression by local low-pass filtering worked surprisingly better than global principal components analysis (PCA). These results are examined and possible explanations are explored.


A Micropower Analog VLSI HMM State Decoder for Wordspotting

Neural Information Processing Systems

We describe the implementation of a hidden Markov model state decoding system, a component for a wordspotting speech recognition system. The key specification for this state decoder design is microwatt power dissipation; this requirement led to a continuoustime, analog circuit implementation. We characterize the operation of a 10-word (81 state) state decoder test chip.


Clustering Sequences with Hidden Markov Models

Neural Information Processing Systems

This paper discusses a probabilistic model-based approach to clustering sequences, using hidden Markov models (HMMs). The problem can be framed as a generalization of the standard mixture model approach to clustering in feature space. Two primary issues are addressed. First, a novel parameter initialization procedure is proposed, and second, the more difficult problem of determining the number of clusters K, from the data, is investigated. Experimental results indicate that the proposed techniques are useful for revealing hidden cluster structure in data sets of sequences.