Goto

Collaborating Authors

 Technology


A Model of Auditory Streaming

Neural Information Processing Systems

The formation of associations between signals, which are considered to arise from the same external source, allows the organism to recognise significant patterns and relationships within the signals from each source without being confused by accidental coincidences between unrelated signals (Bregman, 1990). The intrinsically temporal nature of sound means that in addition to being able to focus on the signal of interest, perhaps of equal significance, is the ability to predict how that signal is expected to progress; such expectations can then be used to facilitate further processing of the signal. It is important to remember that perception is a creative act (Luria, 1980). The organism creates its interpretation of the world in response to the current stimuli, within the context of its current state of alertness, attention, and previous experience. The creative aspects of perception are exemplified in the auditory system where peripheral processing decomposes acoustic stimuli.


Modeling Saccadic Targeting in Visual Search

Neural Information Processing Systems

Visual cognition depends criticalIy on the ability to make rapid eye movements known as saccades that orient the fovea over targets of interest in a visual scene. Saccades are known to be ballistic: the pattern of muscle activation for foveating a prespecified target location is computed prior to the movement and visual feedback is precluded. Despite these distinctive properties, there has been no general model of the saccadic targeting strategy employed by the human visual system during visual search in natural scenes. This paper proposes a model for saccadic targeting that uses iconic scene representations derived from oriented spatial filters at multiple scales. Visual search proceeds in a coarse-to-fine fashion with the largest scale filter responses being compared first. The model was empirically tested by comparing its perfonnance with actual eye movement data from human subjects in a natural visual search task; preliminary results indicate substantial agreement between eye movements predicted by the model and those recorded from human subjects.


Classifying Facial Action

Neural Information Processing Systems

Measurement of facial expressions is important for research and assessment psychiatry, neurology,and experimental psychology (Ekman, Huang, Sejnowski, & Hager, 1992), and has technological applications in consumer-friendly user interfaces, interactive videoand entertainment rating. The Facial Action Coding System (FACS) is a method for measuring facial expressions in terms of activity in the underlying facial muscles (Ekman & Friesen, 1978). We are exploring ways to automate FACS.


A Framework for Non-rigid Matching and Correspondence

Neural Information Processing Systems

Matching feature point sets lies at the core of many approaches to object recognition. We present a framework for nonrigid matching thatbegins with a skeleton module, affine point matching, and then integrates multiple features to improve correspondence and develops an object representation based on spatial regions to model local transformations.


KODAK lMAGELINKโ„ข OCR Alphanumeric Handprint Module

Neural Information Processing Systems

There are two neural network algorithms at its cme: the first network is trained to find individual characters in an alphamuneric field, while the second one perfmns the classification. Both networks were trained on Gabor projections of the ociginal pixel images, which resulted in higher recognition rates and greater noise immunity. Compared to its purely numeric counterpart (Shusurovich and Thrasher, 1995), this version of the system has a significant applicatim specific postprocessing module. The system has been implemented in specialized parallel hardware, which allows it to run at 80 char/sec/board. It has been installed at the Driver and Vehicle Licensing Agency (DVLA) in the United Kingdom.


Selective Attention for Handwritten Digit Recognition

Neural Information Processing Systems

Completely parallel object recognition is NPcomplete. Achieving a recognizer with feasible complexity requires a compromise between paralleland sequential processing where a system selectively focuses on parts of a given image, one after another. Successive fixations are generated to sample the image and these samples are processed and abstracted to generate a temporal context in which results are integrated over time. A computational model based on a partially recurrent feedforward network is proposed and made credible bytesting on the real-world problem of recognition of handwritten digitswith encouraging results.


Handwritten Word Recognition using Contextual Hybrid Radial Basis Function Network/Hidden Markov Models

Neural Information Processing Systems

A hybrid and contextual radial basis function networklhidden Markov model off-line handwritten word recognition system is presented. The task assigned to the radial basis function networks is the estimation of emission probabilities associated to Markov states. The model is contextual becausethe estimation of emission probabilities takes into account the left context of the current image segment as represented by its predecessor inthe sequence. The new system does not outperform the previous system without context but acts differently.


A New Learning Algorithm for Blind Signal Separation

Neural Information Processing Systems

A new online learning algorithm which minimizes a statistical dependency amongoutputs is derived for blind separation of mixed signals. The dependency is measured by the average mutual information (MI)of the outputs. The source signals and the mixing matrix are unknown except for the number of the sources. The Gram-Charlier expansion instead of the Edgeworth expansion is used in evaluating the MI. The natural gradient approach is used to minimize the MI. A novel activation function is proposed for the online learning algorithm which has an equivariant property and is easily implemented on a neural network like model. The validity of the new learning algorithm are verified by computer simulations.


Context-Dependent Classes in a Hybrid Recurrent Network-HMM Speech Recognition System

Neural Information Processing Systems

A method for incorporating context-dependent phone classes in a connectionist-HMM hybrid speech recognition system is introduced. Amodular approach is adopted, where single-layer networks discriminate between different context classes given the phone class and the acoustic data. The context networks are combined with a context-independent (CI) network to generate context-dependent (CD) phone probability estimates. Experiments show an average reduction in word error rate of 16% and 13% from the CI system on ARPA 5,000 word and SQALE 20,000 word tasks respectively. Due to improved modelling, the decoding speed of the CD system is more than twice as fast as the CI system.


Forward-backward retraining of recurrent neural networks

Neural Information Processing Systems

This paper describes the training of a recurrent neural network as the letter posterior probability estimator for a hidden Markov model, off-line handwriting recognition system. The network estimates posteriordistributions for each of a series of frames representing sectionsof a handwritten word. The supervised training algorithm, backpropagation through time, requires target outputs to be provided for each frame. Three methods for deriving these targets are presented. A novel method based upon the forwardbackward algorithmis found to result in the recognizer with the lowest error rate. 1 Introduction In the field of off-line handwriting recognition, the goal is to read a handwritten document and produce a machine transcription.