Undirected Networks
Transforming Neural-Net Output Levels to Probability Distributions
This problem can be solved by treating the trained network as a preprocessor that produces a feature vector that can be further processed, for instance by classical statistical estimation techniques. It is particularly useful to combine these two ideas: we implement the ideas of section 1 using Parzen windows, where the shape and relative size of each window are computed using the ideas of section 2. This allows us to make contact between important theoretical ideas (e.g. the ensemble formalism) and practical techniques. Our results also shed new light on and generalize the well-known "soft max" scheme. In many neural-net applications, it is crucial to produce a set of C numbers that serve as estimates of the probability of C mutually exclusive outcomes. For example, in speech recognition, these numbers represent the probability of C different phonemes; the probabilities of successive segments can be combined using a Hidden Markov Model.
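As a point of reference for the "soft max" scheme mentioned above, the following minimal sketch shows the standard softmax mapping from C real-valued output levels to C probabilities. It is the textbook form only, not the Parzen-window method the abstract describes; the function name and the example output levels are illustrative assumptions.

```python
import math

def softmax(levels):
    """Map real-valued output levels to a probability distribution."""
    peak = max(levels)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(v - peak) for v in levels]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical output levels for C = 3 mutually exclusive outcomes.
probs = softmax([2.0, 1.0, 0.1])
```

The resulting values are positive, sum to one, and preserve the ordering of the original output levels.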
Markov Random Fields Can Bridge Levels of Abstraction
Network vision systems must make inferences from evidential information across levels of representational abstraction, from low-level invariants, through intermediate scene segments, to high-level behaviorally relevant object descriptions. This paper shows that such networks can be realized as Markov Random Fields (MRFs). We show first how to construct an MRF functionally equivalent to a Hough transform parameter network, thus establishing a principled probabilistic basis for visual networks. Second, we show that these MRF parameter networks are more capable and flexible than traditional methods. In particular, they have a well-defined probabilistic interpretation, intrinsically incorporate feedback, and offer richer representations and decision capabilities.
Time-Warping Network: A Hybrid Framework for Speech Recognition
Such systems attempt to combine the best features of both models: the temporal structure of HMMs and the discriminative power of neural networks. In this work we define a time-warping (TW) neuron that extends the operation of the formal neuron of a back-propagation network by warping the input pattern to match it optimally to its weights. We show that a single-layer network of TW neurons is equivalent to a Gaussian density HMM-based system, and that the discriminative power of this system can be improved by using back-propagation discriminative training. The results indicate that the recognition performance improves.
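To illustrate what warping an input pattern to match a stored template can mean, here is a minimal sketch of classical dynamic time warping (DTW) on one-dimensional sequences. This is a swapped-in, well-known technique used only for illustration; it is not the TW neuron defined in the paper, and the function name, the squared-difference cost, and the example sequences are assumptions.

```python
def dtw_distance(pattern, template):
    """Minimum cumulative squared difference over monotonic alignments."""
    inf = float("inf")
    rows, cols = len(pattern), len(template)
    # cost[i][j] = best alignment cost of pattern[:i] with template[:j]
    cost = [[inf] * (cols + 1) for _ in range(rows + 1)]
    cost[0][0] = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            local = (pattern[i - 1] - template[j - 1]) ** 2
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[rows][cols]

# Hypothetical template and a time-stretched version of the same shape.
template = [0.0, 1.0, 2.0, 1.0]
stretched = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 1.0]
distance = dtw_distance(stretched, template)
```

Because the stretched sequence only repeats samples of the template, the optimal alignment absorbs the stretching and the distance is zero.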
Fault Diagnosis of Antenna Pointing Systems using Hybrid Neural Network and Signal Processing Models
The Deep Space Network (DSN) (designed and operated by the Jet Propulsion Laboratory (JPL) for the National Aeronautics and Space Administration (NASA)) is unique in terms of providing end-to-end telecommunication capabilities between Earth and various interplanetary spacecraft throughout the solar system. The ground component of the DSN consists of three ground station complexes located in California, Spain and Australia, giving full 24-hour coverage for deep space communications.
Connectionist Optimisation of Tied Mixture Hidden Markov Models
Issues relating to the estimation of hidden Markov model (HMM) local probabilities are discussed. In particular we note the isomorphism of radial basis function (RBF) networks to tied mixture density modelling; additionally we highlight the differences between these methods arising from the different training criteria employed. We present a method in which connectionist training can be modified to resolve these differences and discuss some preliminary experiments. Finally, we discuss some outstanding problems with discriminative training.
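The isomorphism noted above can be made concrete with a small sketch: in a tied mixture model every HMM state shares one codebook of Gaussians and differs only in its mixture weights, which has the same structure as an RBF network with Gaussian hidden units and a linear output layer. The code below is an illustration under simplifying assumptions (one-dimensional inputs, hypothetical means, variances and weights), not the training method of the paper.

```python
import math

def gaussian(x, mean, var):
    """One-dimensional Gaussian density, playing the role of an RBF unit."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def tied_mixture_likelihoods(x, means, variances, weights):
    """Per-state likelihoods: sum over k of weights[j][k] * N(x; means[k], variances[k])."""
    # Hidden layer: one shared Gaussian basis function per codebook entry.
    hidden = [gaussian(x, m, v) for m, v in zip(means, variances)]
    # Output layer: each HMM state applies its own mixture weights linearly.
    return [sum(w * h for w, h in zip(row, hidden)) for row in weights]

# Hypothetical shared codebook of two Gaussians and two HMM states.
means, variances = [0.0, 3.0], [1.0, 1.0]
weights = [[0.9, 0.1], [0.2, 0.8]]  # each row sums to 1
likelihoods = tied_mixture_likelihoods(0.5, means, variances, weights)
```

The two views differ in how the weights are trained (maximum likelihood versus a discriminative criterion), which is the difference the abstract highlights.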
Improved Hidden Markov Model Speech Recognition Using Radial Basis Function Networks
A high performance speaker-independent isolated-word hybrid speech recognizer was developed which combines Hidden Markov Models (HMMs) and Radial Basis Function (RBF) neural networks. In recognition experiments using a speaker-independent E-set database, the hybrid recognizer had an error rate of 11.5% compared to 15.7% for the robust unimodal Gaussian HMM recognizer upon which the hybrid system was based. These results and additional experiments demonstrate that RBF networks can be successfully incorporated in hybrid recognizers and suggest that they may be capable of good performance with fewer parameters than required by Gaussian mixture classifiers. A global parameter optimization method designed to minimize the overall word error rather than the frame recognition error failed to reduce the error rate. A hybrid isolated-word speech recognizer was developed which combines neural network and Hidden Markov Model (HMM) approaches.
Neural Network - Gaussian Mixture Hybrid for Speech Recognition or Density Estimation
The subject of this paper is the integration of multi-layered Artificial Neural Networks (ANN) with probability density functions such as Gaussian mixtures found in continuous density Hidden Markov Models (HMM). In the first part of this paper we present an ANN/HMM hybrid in which all the parameters of the system are simultaneously optimized with respect to a single criterion. In the second part of this paper, we study the relationship between the density of the inputs of the network and the density of the outputs of the network. A few experiments are presented to explore how to perform density estimation with ANNs.
Context-Dependent Multiple Distribution Phonetic Modeling with MLPs
A number of hybrid multilayer perceptron (MLP)/hidden Markov model (HMM) speech recognition systems have been developed in recent years (Morgan and Bourlard). The new training procedure smooths MLPs trained at different degrees of context dependence in order to obtain a robust estimate of the context-dependent probabilities. Tests with the DARPA Resource Management database have shown substantial advantages of the context-dependent MLPs over earlier context-independent MLPs.
Time Warping Invariant Neural Networks
Although TWINN is a simple modification of the well-known recurrent neural network, analysis has shown that TWINN completely removes time warping and is able to handle difficult classification problems. This may help to explain the well-accepted fact that, when learning grammatical inference with NNFA, one has to start with very short strings in the training set. The numerical example we used is a trajectory classification problem. With TWINN this problem was learned in 100 iterations. As a benchmark, we also trained a TDNN on exactly the same problem; as expected, it failed completely.
Hidden Markov Model Induction by Bayesian Model Merging
Hidden Markov Models (HMMs) are a well-studied approach to the modelling of sequence data. HMMs can be viewed as a stochastic generalization of finite-state automata, where both the transitions between states and the generation of output symbols are governed by probability distributions. HMMs have been important in speech recognition (Rabiner & Juang, 1986), cryptography, and more recently in other areas such as protein classification and alignment (Haussler, Krogh, Mian & Sjölander, 1992; Baldi, Chauvin, Hunkapiller & McClure, 1993). Practitioners have typically chosen the HMM topology by hand, so that learning the HMM from sample data means estimating only a fixed number of model parameters. The standard approach is to find a maximum likelihood (ML) or maximum a posteriori probability (MAP) estimate of the HMM parameters.
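The description of an HMM as a finite-state automaton with probabilistic transitions and outputs can be sketched with the standard forward algorithm, which computes the likelihood of an observation sequence, the quantity that ML estimation maximizes over the parameters. The sketch assumes a discrete-output HMM with a fixed, hand-chosen topology; the function name and the two-state parameters are hypothetical, and it does not implement the model merging procedure of the paper.

```python
def forward_likelihood(init, trans, emit, observations):
    """Return P(observations) under a discrete HMM via the forward algorithm."""
    n_states = len(init)
    # alpha[s] = P(observations so far, current state = s)
    alpha = [init[s] * emit[s][observations[0]] for s in range(n_states)]
    for symbol in observations[1:]:
        alpha = [
            sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][symbol]
            for s in range(n_states)
        ]
    return sum(alpha)

# Hypothetical two-state model over the output symbols 0 and 1.
init = [0.6, 0.4]                   # initial state probabilities
trans = [[0.7, 0.3], [0.4, 0.6]]    # trans[r][s] = P(next state s | state r)
emit = [[0.9, 0.1], [0.2, 0.8]]     # emit[s][o] = P(symbol o | state s)
likelihood = forward_likelihood(init, trans, emit, [0, 1, 0])
```

Summing the returned likelihood over all observation sequences of a given length yields one, which is a quick sanity check on the parameters.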