Technology
Laterally Interconnected Self-Organizing Maps in Hand-Written Digit Recognition
Choe, Yoonsuck, Sirosh, Joseph, Miikkulainen, Risto
An application of laterally interconnected self-organizing maps (LISSOM) to handwritten digit recognition is presented. The resulting excitatory connections focus the activity into local patches and the inhibitory connections decorrelate redundant activity on the map. The map thus forms internal representations that are easy to recognize with e.g. a perceptron network. The recognition rate on a subset of NIST database 3 is 4.0% higher with LISSOM than with a regular Self-Organizing Map (SOM) as the front end, and 15.8% higher than recognition of raw input bitmaps directly. These results form a promising starting point for building pattern recognition systems with a LISSOM map as a front end. 1 Introduction Handwritten digit recognition has become one of the touchstone problems in neural networks recently.
Onset-based Sound Segmentation
A technique for segmenting sounds using processing based on mammalian early auditory processing is presented. The technique is based on features in sound which neuron spike recording suggests are detected in the cochlear nucleus. The sound signal is bandpassed and each signal processed to enhance onsets and offsets. The onset and offset signals are compressed, then clustered both in time and across frequency channels using a network of integrate-and-fire neurons. Onsets and offsets are signalled by spikes, and the timing of these spikes used to segment the sound. 1 Background Traditional speech interpretation techniques based on Fourier transforms, spectrum recoding, and a hidden Markov model or neural network interpretation stage have limitations both in continuous speech and in interpreting speech in the presence of noise, and this has led to interest in front ends modelling biological auditory systems for speech interpretation systems (Ainsworth and Meyer 92; Cosi 93; Cole et al 95). Auditory modelling systems use similar early auditory processing to that used in biological systems.
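The onset-to-spike step described above can be illustrated with a minimal sketch. It is not the authors' system: it assumes a single frequency channel whose amplitude envelope is already available, uses a half-wave rectified first difference as the onset enhancer, and replaces the network of neurons with one leaky integrate-and-fire unit. The function names and the `leak` and `threshold` parameters are hypothetical.

```python
def onset_signal(envelope):
    """Half-wave rectified first difference: positive only where the envelope rises."""
    return [max(0.0, b - a) for a, b in zip(envelope, envelope[1:])]


def integrate_and_fire(drive, leak=0.9, threshold=1.0):
    """Leaky integrate-and-fire unit (assumed parameters); returns spike indices."""
    potential = 0.0
    spikes = []
    for t, x in enumerate(drive):
        # The membrane potential decays by `leak` and accumulates the input.
        potential = leak * potential + x
        if potential >= threshold:
            spikes.append(t)
            potential = 0.0  # reset after firing
    return spikes


# Toy envelope: silence, a burst, silence, a second burst.
envelope = [0.0] * 5 + [2.0] * 5 + [0.0] * 5 + [2.0] * 5
spikes = integrate_and_fire(onset_signal(envelope))
```

Each spike index marks the step at which the envelope rises, so the spike times can serve as candidate segment boundaries. Offsets could be handled the same way by rectifying the negated difference.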
Parallel analog VLSI architectures for computation of heading direction and time-to-contact
Indiveri, Giacomo, Kramer, Jörg, Koch, Christof
To exploit their properties at a system level, we developed parallel image processing architectures for applications that rely mostly on the qualitative properties of the optical flow, rather than on the precise values of the velocity vectors. Specifically, we designed two parallel architectures that employ arrays of elementary motion sensors for the computation of heading direction and time-to-contact. The application domain that we took into consideration for the implementation of such architectures is the promising one of vehicle navigation. Having defined the types of images to be analyzed and the types of processing to perform, we were able to use a priori information to integrate selectively the sparse data obtained from the velocity sensors and determine the qualitative properties of the optical flow field of interest.
Model Matching and SFMD Computation
Rehfuss, Steven, Hammerstrom, Dan W.
In systems that process sensory data there is frequently a model matching stage where class hypotheses are combined to recognize a complex entity. We introduce a new model of parallelism, the Single Function Multiple Data (SFMD) model, appropriate to this stage. SFMD functionality can be added with small hardware expense to certain existing SIMD architectures, and as an incremental addition to the programming model. Adding SFMD to an SIMD machine will not only allow faster model matching, but also increase its flexibility as a general purpose machine and its scope in performing the initial stages of sensory processing. 1 INTRODUCTION In systems that process sensory data there is frequently a post-classification stage where several independent class hypotheses are combined into the recognition of a more complex entity. Examples include matching word models with a string of observation probabilities, and matching visual object models with collections of edges or other features. Current parallel computer architectures for processing sensory data focus on the classification and pre-classification stages (Hammerstrom 1990). This is reasonable, as those stages likely have the largest potential for speedup through parallel execution. Nonetheless, the model-matching stage is also suitable for parallelism, as each model may be matched independently of the others. We introduce a new style of parallelism, Single Function Multiple Data (SFMD), that is suitable for the model-matching stage.
VLSI Model of Primate Visual Smooth Pursuit
Etienne-Cummings, Ralph, Spiegel, Jan Van der, Mueller, Paul
A one-dimensional model of the primate smooth pursuit mechanism has been implemented in 2 µm CMOS VLSI. The model consolidates Robinson's negative feedback model with Wyatt and Pola's positive feedback scheme, to produce a smooth pursuit system which zeros the velocity of a target on the retina. Furthermore, the system uses the current eye motion as a predictor for future target motion. Analysis, stability and biological correspondence of the system are discussed. For implementation at the focal plane, a local correlation based visual motion detection technique is used. Velocity measurements, ranging over 4 orders of magnitude with 15% variation, provide the input to the smooth pursuit system. The system performed successful velocity tracking for high contrast scenes. Circuit design and performance of the complete smooth pursuit system are presented.
Silicon Models for Auditory Scene Analysis
Lazzaro, John, Wawrzynek, John
We are developing special-purpose, low-power analog-to-digital converters for speech and music applications, that feature analog circuit models of biological audition to process the audio signal before conversion. This paper describes our most recent converter design, and a working system that uses several copies of the chip to compute multiple representations of sound from an analog input. This multi-representation system demonstrates the plausibility of inexpensively implementing an auditory scene analysis approach to sound processing. 1. INTRODUCTION The visual system computes multiple representations of the retinal image, such as motion, orientation, and stereopsis, as an early step in scene analysis. Likewise, the auditory brainstem computes secondary representations of sound, emphasizing properties such as binaural disparity, periodicity, and temporal onsets. Recent research in auditory scene analysis involves using computational models of these auditory brainstem representations in engineering applications. Computation is a major limitation in auditory scene analysis research: the complete auditory processing system described in (Brown and Cooke, 1994) operates at approximately 4000 times real time, running under UNIX on a Sun SPARCstation 1. Standard approaches to hardware acceleration for signal processing algorithms could be used to ease this computational burden in a research environment; a variety of parallel, fixed-point hardware products would work well on these algorithms.
Neuron-MOS Temporal Winner Search Hardware for Fully-Parallel Data Processing
Shibata, Tadashi, Nakai, Tsutomu, Morimoto, Tatsuo, Kaihara, Ryu, Yamashita, Takeo, Ohmi, Tadahiro
Search for the largest (or the smallest) among a number of input data, i.e., the winner-take-all (WTA) action, is an essential part of intelligent data processing such as data retrieval in associative memories [3], vector quantization circuits [4], Kohonen's self-organizing maps [5] etc. In addition to the maximum or minimum search, data sorting also plays an essential role in a number of signal processing tasks such as median filtering in image processing, evolutionary algorithms in optimizing problems [6] and so forth.
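For readers unfamiliar with the two operations named above, the following software sketch shows what a winner-take-all search and a sorting-based median filter compute. It says nothing about the neuron-MOS hardware itself. The function names, the tie-breaking rule, and the choice to leave edge samples unchanged are assumptions made for illustration.

```python
def winner_take_all(values):
    """Return the index of the largest input (the first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def median_filter(signal, width=3):
    """Replace each sample by the median of a sliding window of odd `width`.

    Samples closer than width // 2 to either end are left unchanged.
    """
    half = width // 2
    out = list(signal)
    for i in range(half, len(signal) - half):
        # Sorting the window is the data-sorting step; the middle element is the median.
        window = sorted(signal[i - half:i + half + 1])
        out[i] = window[half]
    return out


winner = winner_take_all([3, 9, 4, 9])
smoothed = median_filter([1, 1, 9, 1, 1])
```

Here `winner` is the position of the first maximum, and `smoothed` has the isolated outlier removed. A minimum search follows by replacing `max` with `min`.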
Does the Wake-sleep Algorithm Produce Good Density Estimators?
Frey, Brendan J., Hinton, Geoffrey E., Dayan, Peter
The wake-sleep algorithm (Hinton, Dayan, Frey and Neal 1995) is a relatively efficient method of fitting a multilayer stochastic generative model to high-dimensional data. In addition to the top-down connections in the generative model, it makes use of bottom-up connections for approximating the probability distribution over the hidden units given the data, and it trains these bottom-up connections using a simple delta rule. We use a variety of synthetic and real data sets to compare the performance of the wake-sleep algorithm with Monte Carlo and mean field methods for fitting the same generative model and also compare it with other models that are less powerful but easier to fit. 1 INTRODUCTION Neural networks are often used as bottom-up recognition devices that transform input vectors into representations of those vectors in one or more hidden layers. But multilayer networks of stochastic neurons can also be used as top-down generative models that produce patterns with complicated correlational structure in the bottom visible layer. In this paper we consider generative models composed of layers of stochastic binary logistic units. Given a generative model parameterized by top-down weights, there is an obvious way to perform unsupervised learning. The generative weights are adjusted to maximize the probability that the visible vectors generated by the model would match the observed data.
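The interplay of top-down and bottom-up connections described above can be sketched as follows. This is a simplified illustration, not the authors' experimental setup: it assumes a single hidden layer of binary logistic units with independent prior biases, and all function and variable names are hypothetical. The same delta rule is applied to the generative weights in the wake phase and to the recognition weights in the sleep phase.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict(weights, biases, inputs):
    """Logistic unit probabilities: p_i = sigmoid(b_i + sum_j W[i][j] * inputs[j])."""
    return [sigmoid(b + sum(w * x for w, x in zip(row, inputs)))
            for row, b in zip(weights, biases)]


def sample(probs, rng):
    """Draw one binary state per unit."""
    return [1 if rng.random() < p else 0 for p in probs]


def delta_rule(weights, biases, inputs, targets, lr):
    """Move each unit's predicted probability toward its target state (in place)."""
    probs = predict(weights, biases, inputs)
    for i, (t, p) in enumerate(zip(targets, probs)):
        err = lr * (t - p)
        biases[i] += err
        for j, x in enumerate(inputs):
            weights[i][j] += err * x


def wake_sleep_step(v, gen_w, gen_b, rec_w, rec_b, prior_b, rng, lr=0.1):
    # Wake: recognise the data bottom-up, then train the generative side.
    h = sample(predict(rec_w, rec_b, v), rng)
    delta_rule(gen_w, gen_b, h, v, lr)
    for k, hk in enumerate(h):
        prior_b[k] += lr * (hk - sigmoid(prior_b[k]))
    # Sleep: generate a fantasy top-down, then train the recognition side on it.
    h_dream = sample([sigmoid(b) for b in prior_b], rng)
    v_dream = sample(predict(gen_w, gen_b, h_dream), rng)
    delta_rule(rec_w, rec_b, v_dream, h_dream, lr)


# Toy run: 3 visible units, 2 hidden units, one repeated training pattern.
rng = random.Random(0)
gen_w, gen_b = [[0.0] * 2 for _ in range(3)], [0.0] * 3
rec_w, rec_b = [[0.0] * 3 for _ in range(2)], [0.0] * 2
prior_b = [0.0] * 2
for _ in range(50):
    wake_sleep_step([1, 0, 1], gen_w, gen_b, rec_w, rec_b, prior_b, rng)
```

In this toy run the generative biases drift toward the repeated pattern, since each wake update pushes the predicted probability of a visible unit toward its observed state. A deeper model would repeat both phases layer by layer.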
Using Unlabeled Data for Supervised Learning
Geoffrey Towell, Siemens Corporate Research, 755 College Road East, Princeton, NJ 08540
Many classification problems have the property that the only costly part of obtaining examples is the class label. This paper suggests a simple method for using distribution information contained in unlabeled examples to augment labeled examples in a supervised training framework. Empirical tests show that the technique described in this paper can significantly improve the accuracy of a supervised learner when the learner is well below its asymptotic accuracy level. 1 INTRODUCTION Supervised learning problems often have the following property: unlabeled examples have little or no cost while class labels have a high cost. For example, it is trivial to record hours of heartbeats from hundreds of patients. However, it is expensive to hire cardiologists to label each of the recorded beats.
Is Learning The n-th Thing Any Easier Than Learning The First?
This paper investigates learning in a lifelong context. Lifelong learning addresses situations in which a learner faces a whole stream of learning tasks. Such scenarios provide the opportunity to transfer knowledge across multiple learning tasks, in order to generalize more accurately from less training data. In this paper, several different approaches to lifelong learning are described, and applied in an object recognition domain. It is shown that across the board, lifelong learning approaches generalize consistently more accurately from less training data, by their ability to transfer knowledge across learning tasks. 1 Introduction Supervised learning is concerned with approximating an unknown function based on examples. Virtually all current approaches to supervised learning assume that one is given a set of input-output examples, denoted by X, which characterize an unknown function, denoted by f.