Collaborating Authors

 Bridle, John S.


Unsupervised Classifiers, Mutual Information and 'Phantom Targets'

Neural Information Processing Systems

We derive criteria for training adaptive classifier networks to perform unsupervised data analysis. The first criterion turns a simple Gaussian classifier into a simple Gaussian mixture analyser. The second criterion, which is much more generally applicable, is based on mutual information.
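
As a rough numerical illustration (our sketch, not code from the paper), the mutual-information criterion can be estimated on a batch of softmax outputs as the entropy of the batch-averaged output minus the average per-input entropy; maximising it rewards confident decisions on each input while keeping the classes evenly used overall. All names below are our own.

import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    # Shannon entropy in nats along the given axis.
    return -np.sum(p * np.log(p + eps), axis=axis)

def mi_criterion(probs):
    # probs: (batch, classes) softmax outputs p(y|x).
    # Batch estimate of I(x; y) = H(p_bar) - mean_x H(p(y|x)),
    # to be maximised during unsupervised training.
    p_bar = probs.mean(axis=0)  # estimate of the class marginal p(y)
    return entropy(p_bar) - entropy(probs).mean()

# Confident, balanced outputs score near log 2 nats ...
print(mi_criterion(np.array([[0.99, 0.01], [0.01, 0.99]])))  # ~0.64
# ... while a collapsed classifier (same output for everything) scores ~0.
print(mi_criterion(np.array([[0.99, 0.01], [0.99, 0.01]])))  # ~0.0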


RecNorm: Simultaneous Normalisation and Classification applied to Speech Recognition

Neural Information Processing Systems

A particular form of neural network is described, which has terminals for acoustic patterns, class labels and speaker parameters. A method of training this network to "tune in" the speaker parameters to a particular speaker is outlined, based on a trick for converting a supervised network to an unsupervised mode. We describe experiments using this approach in isolated word recognition based on whole-word hidden Markov models. The results indicate an improvement over speaker-independent performance and, for unlabelled data, a performance close to that achieved on labelled data.

1 INTRODUCTION

We are concerned to emulate some aspects of perception. In particular, the way that a stimulus which is ambiguous, perhaps because of unknown lighting conditions, can become unambiguous in the context of other such stimuli: the fact that they are subject to the same unknown conditions gives our perceptual apparatus enough constraints to solve the problem. Individual words are often ambiguous even to human listeners. For instance, a Cockney might say the word "ace" to sound the same as a Standard English speaker's "ice". Similarly with "room" and "rum", or "work" and "walk" in other pairs of British English accents. If we heard one of these ambiguous pronunciations, knowing nothing else about the speaker, we could not tell which word had been said.
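
The following is a speculative sketch of the "tune in" step, not the paper's implementation: the additive feature offset, the optimiser, and the use of the companion paper's mutual-information criterion as the unsupervised objective are all assumptions made for illustration. The idea it captures is that a frozen, trained classifier plus a batch of one speaker's unlabelled utterances constrains a small set of speaker parameters.

import torch

def adapt_speaker(classifier, features, steps=50, lr=0.1):
    # classifier: frozen module, (batch, dim) -> (batch, n_classes) logits.
    # features: (batch, dim) unlabelled acoustic patterns from one speaker.
    # Returns a learned offset that "normalises" this speaker's features.
    offset = torch.zeros(features.shape[1], requires_grad=True)
    opt = torch.optim.SGD([offset], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        p = torch.softmax(classifier(features + offset), dim=-1)
        h_cond = -(p * p.clamp_min(1e-12).log()).sum(-1).mean()  # mean H(p(y|x))
        p_bar = p.mean(0)                                        # batch class marginal
        h_marg = -(p_bar * p_bar.clamp_min(1e-12).log()).sum()   # H(p_bar)
        loss = h_cond - h_marg  # minimising this maximises the MI estimate
        loss.backward()
        opt.step()
    return offset.detach()

With a zero offset this reduces to the speaker-independent baseline; the abstract reports that adapting on unlabelled data brings performance close to that achieved on labelled data.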


Training Stochastic Model Recognition Algorithms as Networks can Lead to Maximum Mutual Information Estimation of Parameters

Neural Information Processing Systems

One of the attractions of neural network approaches to pattern recognition is the use of a discrimination-based training method. We show that once we have modified the output layer of a multilayer perceptron to provide mathematically correct probability distributions, and replaced the usual squared error criterion with a probability-based score, the result is equivalent to Maximum Mutual Information training, which has been used successfully to improve the performance of hidden Markov models for speech recognition. If the network is specially constructed to perform the recognition computations of a given kind of stochastic-model-based classifier, then we obtain a method for discrimination-based training of the parameters of the models. Examples include an HMM-based word discriminator, which we call an 'Alphanet'.
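
The two modifications the abstract names can be written out in a few lines (a minimal sketch; the function names are ours): a softmax output layer makes the outputs a mathematically correct probability distribution, and the probability-based score is the negative mean log-probability of the correct class, whose minimisation is what connects the criterion to Maximum Mutual Information estimation.

import numpy as np

def softmax(z):
    # Shift for numerical stability; each row sums to one.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def log_prob_score(logits, labels):
    # Negative mean log-probability of the correct classes
    # (cross-entropy), replacing the usual squared-error criterion.
    logp = np.log(softmax(logits))
    return -logp[np.arange(len(labels)), labels].mean()

logits = np.array([[2.0, 0.5, -1.0], [0.1, 1.5, 0.3]])
print(log_prob_score(logits, np.array([0, 1])))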

