Goto

Collaborating Authors

 Technology






Speaker Recognition Using Neural Tree Networks

Neural Information Processing Systems

A new classifier is presented for text-independent speaker recognition. The new classifier is called the modified neural tree network (MNTN). The NTN is a hierarchical classifier that combines the properties of decision trees and feed-forward neural networks. The MNTN differs from the standard NTNin that a new learning rule based on discriminant learning is used, which minimizes the classification error as opposed to a norm of the approximation error. The MNTN also uses leaf probability measures inaddition to the class labels.


Lipreading by neural networks: Visual preprocessing, learning, and sensory integration

Neural Information Processing Systems

Automated speech recognition is notoriously hard, and thus any predictive source of information and constraints that could be incorporated into a computer speech recognition system would be desirable. Humans, especially the hearing impaired, can utilize visual information - "speech reading" - for improved accuracy (Dodd & Campbell, 1987, Sanders & Goodrich, 1971). Speech reading can provide direct information about segments, phonemes, rate, speaker gender and identity, and subtle informationfor segmenting speech from background noise or multiple speakers (De Filippo & Sims, 1988, Green & Miller, 1985). Fundamental support for the use of visual information comes from the complementary natureof the visual and acoustic speech signals. Utterances that are difficult to distinguish acoustically are the easiest to distinguish.


Figure of Merit Training for Detection and Spotting

Neural Information Processing Systems

Spotting tasks require detection of target patterns from a background of richly varied non-target inputs. The performance measure of interest for these tasks, called the figure of merit (FOM), is the detection rate for target patterns when the false alarm rate is in an acceptable range. A new approach to training spotters is presented which computes the FOM gradient for each input pattern and then directly maximizes the FOM using backpropagation. This eliminates the need for thresholds during training. It also uses network resources to model Bayesian a posteriori probability functions accurately only for patterns which have a significant effect on the detection accuracy over the false alarm rate of interest.


Analysis of Short Term Memories for Neural Networks

Neural Information Processing Systems

Short term memory is indispensable for the processing of time varying information with artificial neural networks. In this paper a model for linear memories is presented, and ways to include memories in connectionist topologies are discussed. A comparison is drawn among different memory types, with indication of what is the salient characteristic of each memory model. 1 INTRODUCTION An adaptive system that has to interact with the external world is faced with the problem of coping with the time varying nature of real world signals. Time varying signals, natural or man made, carry information in their time structure. The problem is then one of devising methods and topologies (in the case of interest here, neural topologies) that explore information along time.This problem can be appropriately called temporal pattern recognition, as opposed to the more traditional case of static pattern recognition.


Dual Mechanisms for Neural Binding and Segmentation

Neural Information Processing Systems

We propose that the binding and segmentation of visual features is mediated by two complementary mechanisms; a low resolution, spatial-based,resource-free process and a high resolution, temporal-based, resource-limited process. In the visual cortex, the former depends upon the orderly topographic organization in striate andextrastriate areas while the latter may be related to observed temporalrelationships between neuronal activities. Computer simulations illustrate the role the two mechanisms play in figure/ ground discrimination, depth-from-occlusion, and the vividness ofperceptual completion. 1 COMPLEMENTARY BINDING MECHANISMS The "binding problem" is a classic problem in computational neuroscience which considers how neuronal activities are grouped to create mental representations. For the case of visual processing, the binding of neuronal activities requires a mechanism forselectively grouping fragmented visual features in order to construct the coherent representations (i.e.


Resolving motion ambiguities

Neural Information Processing Systems

We address the problem of optical flow reconstruction and in particular theproblem of resolving ambiguities near edges. They occur due to (i) the aperture problem and (ii) the occlusion problem, where pixels on both sides of an intensity edge are assigned the same velocity estimates (and confidence). However, these measurements are correct for just one side of the edge (the non occluded one). Our approach is to introduce an uncertamty field with respect to the estimates and confidence measures. We note that the confidence measuresare large at intensity edges and larger at the convex sides of the edges, i.e. inside corners, than at the concave side. We resolve the ambiguities through local interactions via coupled Markov random fields (MRF). The result is the detection of motion for regions of images with large global convexity.