Goto

Collaborating Authors

 Industry


Compositionality, MDL Priors, and Object Recognition

Neural Information Processing Systems

Images are ambiguous at each of many levels of a contextual hierarchy. Nevertheless, the high-level interpretation of most scenes is unambiguous, as evidenced by the superior performance of humans. This observation argues for global vision models, such as deformable templates. Unfortunately, such models are computationally intractable for unconstrained problems. We propose a compositional model in which primitives are recursively composed, subject to syntactic restrictions, to form tree-structured objects and object groupings. Ambiguity is propagated up the hierarchy in the form of multiple interpretations, which are later resolved by a Bayesian, equivalently minimum-description-Iength, cost functional.


Edges are the 'Independent Components' of Natural Scenes.

Neural Information Processing Systems

Field (1994) has suggested that neurons with line and edge selectivities found in primary visual cortex of cats and monkeys form a sparse, distributed representation of natural scenes, and Barlow (1989) has reasoned that such responses should emerge from an unsupervised learning algorithm that attempts to find a factorial code of independent visual features. We show here that nonlinear'infomax', when applied to an ensemble of natural scenes, produces sets of visual filters that are localised and oriented. Some of these filters are Gabor-like and resemble those produced by the sparseness-maximisation network of Olshausen & Field (1996). In addition, the outputs of these filters are as independent as possible, since the infomax network is able to perform Independent Components Analysis (ICA). We compare the resulting ICA filters and their associated basis functions, with other decorrelating filters produced by Principal Components Analysis (PCA) and zero-phase whitening filters (ZCA).


Learning Temporally Persistent Hierarchical Representations

Neural Information Processing Systems

A biologically motivated model of cortical self-organization is proposed. Context is combined with bottom-up information via a maximum likelihood cost function. Clusters of one or more units are modulated by a common contextual gating Signal; they thereby organize themselves into mutually supportive predictors of abstract contextual features. The model was tested in its ability to discover viewpoint-invariant classes on a set of real image sequences of centered, gradually rotating faces. It performed considerably better than supervised back-propagation at generalizing to novel views from a small number of training examples.


Viewpoint Invariant Face Recognition using Independent Component Analysis and Attractor Networks

Neural Information Processing Systems

We have explored two approaches to recogmzmg faces across changes in pose. First, we developed a representation of face images based on independent component analysis (ICA) and compared it to a principal component analysis (PCA) representation for face recognition. The ICA basis vectors for this data set were more spatially local than the PCA basis vectors and the ICA representation had greater invariance to changes in pose. Second, we present a model for the development of viewpoint invariant responses to faces from visual experience in a biological system. The temporal continuity of natural visual experience was incorporated into an attractor network model by Hebbian learning following a lowpass temporal filter on unit activities.


Effective Training of a Neural Network Character Classifier for Word Recognition

Neural Information Processing Systems

We have been conducting research on bottom-up classification techniques ba;ed on trainable artificial neural networks (ANNs), in combination with comprehensive but weakly-applied language models. To focus our work on a subproblem that is tractable enough to le.:'ld to usable products in a reasonable time, we have restricted the domain to hand-printing, so that strokes are clearly delineated by pen lifts. In the process of optimizing overall performance of the recognizer, we have discovered some useful techniques for architecting and training ANNs that must participate in a larger recognition process. Some of these techniques-especially the normalization of output error, frequency balanCing, and error emphal;is-suggest a common theme of significant value derived by reducing the effect of a priori biases in training data to better represent low frequency, low probability smnples, including second and third choice probabilities. There is mnple prior work in combining low-level classifiers with various search strategies to provide integrated segmentation and recognition for writing (Tappert et al 1990) and speech (Renals et aI1992). And there is a rich background in the use of ANNs a-; classifiers, including their use as a low-level, character classifier in a higher-level word recognition system (Bengio et aI1995).


Ensemble Methods for Phoneme Classification

Neural Information Processing Systems

There is now considerable interest in using ensembles or committees of learning machines to improve the performance of the system over that of a single learning machine. In most neural network ensembles, the ensemble members are trained on either the same data (Hansen & Salamon 1990) or different subsets of the data (Perrone & Cooper 1993). The ensemble members typically have different initial conditions and/or different architectures. The subsets of the data may be chosen at random, with prior knowledge or by some principled approach e.g.


A Micropower Analog VLSI HMM State Decoder for Wordspotting

Neural Information Processing Systems

We describe the implementation of a hidden Markov model state decoding system, a component for a wordspotting speech recognition system. The key specification for this state decoder design is microwatt power dissipation; this requirement led to a continuoustime, analog circuit implementation. We characterize the operation of a 10-word (81 state) state decoder test chip.


Dynamically Adaptable CMOS Winner-Take-All Neural Network

Neural Information Processing Systems

The major problem that has prevented practical application of analog neuro-LSIs has been poor accuracy due to fluctuating analog device characteristics inherent in each device as a result of manufacturing. This paper proposes a dynamic control architecture that allows analog silicon neural networks to compensate for the fluctuating device characteristics and adapt to a change in input DC level. We have applied this architecture to compensate for input offset voltages of an analog CMOS WTA (Winner-Take-AlI) chip that we have fabricated. Experimental data show the effectiveness of the architecture.


Analog VLSI Circuits for Attention-Based, Visual Tracking

Neural Information Processing Systems

A one-dimensional visual tracking chip has been implemented using neuromorphic, analog VLSI techniques to model selective visual attention in the control of saccadic and smooth pursuit eye movements. The chip incorporates focal-plane processing to compute image saliency and a winner-take-all circuit to select a feature for tracking. The target position and direction of motion are reported as the target moves across the array. We demonstrate its functionality in a closed-loop system which performs saccadic and smooth pursuit tracking movements using a one-dimensional mechanical eye.


A Spike Based Learning Neuron in Analog VLSI

Neural Information Processing Systems

Many popular learning rules are formulated in terms of continuous, analog inputs and outputs. Biological systems, however, use action potentials, which are digital-amplitude events that encode analog information in the inter-event interval. Action-potential representations are now being used to advantage in neuromorphic VLSI systems as well. We report on a simple learning rule, based on the Riccati equation described by Kohonen [1], modified for action-potential neuronal outputs. We demonstrate this learning rule in an analog VLSI chip that uses volatile capacitive storage for synaptic weights. We show that our time-dependent learning rule is sufficient to achieve approximate weight normalization and can detect temporal correlations in spike trains.