Optimal Movement Primitives
Terence D. Sanger Jet Propulsion Laboratory MS 303-310 4800 Oak Grove Drive Pasadena, CA 91109 (818) 354-9127 tds@ai.mit.edu Abstract The theory of Optimal Unsupervised Motor Learning shows how a network can discover a reduced-order controller for an unknown nonlinear system by representing only the most significant modes. Here, I extend the theory to apply to command sequences, so that the most significant components discovered by the network correspond to motion "primitives". Combinations of these primitives can be used to produce a wide variety of different movements. I demonstrate applications to human handwriting decomposition and synthesis, as well as to the analysis of electrophysiological experiments on movements resulting from stimulation of the frog spinal cord. 1 INTRODUCTION There is much debate within the neuroscience community concerning the internal representation of movement, and current neurophysiological investigations are aimed at uncovering these representations. In this paper, I propose a different approach that attempts to define the optimal internal representation in terms of "movement primitives", and I compare this representation with the observed behavior.
Recognizing Handwritten Digits Using Mixtures of Linear Models
Hinton, Geoffrey E., Revow, Michael, Dayan, Peter
We construct a mixture of locally linear generative models of a collection of pixel-based images of digits, and use them for recognition. Different models of a given digit are used to capture different styles of writing, and new images are classified by evaluating their log-likelihoods under each model. We use an EM-based algorithm in which the M-step is computationally straightforward principal components analysis (PCA). Incorporating tangent-plane information [12] about expected local deformations only requires adding tangent vectors into the sample covariance matrices for the PCA, and it demonstrably improves performance.
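As a rough illustration of classifying with linear models, the sketch below fits one PCA subspace per class and assigns a new image to the class with the smallest reconstruction error. This is a deliberate simplification and not the authors' method: it uses a single linear model per class rather than a mixture, has no EM step and no tangent vectors, and substitutes reconstruction error for the log-likelihood. The function names and the toy data are hypothetical.

```python
import numpy as np


def fit_pca_model(images, n_components):
    """Fit one linear (PCA) model: a mean plus the top principal directions."""
    mean = images.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(images - mean, full_matrices=False)
    return mean, vt[:n_components]


def reconstruction_error(image, model):
    """Squared distance from an image to the model's principal subspace."""
    mean, basis = model
    centered = image - mean
    projected = basis.T @ (basis @ centered)
    return float(np.sum((centered - projected) ** 2))


def classify(image, models):
    """Return the label whose model reconstructs the image best."""
    return min(models, key=lambda label: reconstruction_error(image, models[label]))


# Toy data (hypothetical): two classes lying near different lines in 5-D.
rng = np.random.default_rng(0)
dir_a = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
dir_b = np.array([0.0, 1.0, 0.0, 0.0, 0.0])
class_a = rng.normal(size=(50, 1)) * dir_a + 0.01 * rng.normal(size=(50, 5))
class_b = rng.normal(size=(50, 1)) * dir_b + 0.01 * rng.normal(size=(50, 5))
models = {"a": fit_pca_model(class_a, 1), "b": fit_pca_model(class_b, 1)}
```

Calling `classify(3.0 * dir_a, models)` picks the model whose subspace lies closest to the query point. Extending the dictionary to hold several models per digit would move the sketch closer to the mixture described in the abstract.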
Real-Time Control of a Tokamak Plasma Using Neural Networks
Bishop, Chris M., Haynes, Paul S., Smith, Mike E U, Todd, Tom N., Trotman, David L., Windsor, Colin G.
This paper presents results from the first use of neural networks for the real-time feedback control of high temperature plasmas in a tokamak fusion experiment. The tokamak is currently the principal experimental device for research into the magnetic confinement approach to controlled fusion. In the tokamak, hydrogen plasmas, at temperatures of up to 100 million K, are confined by strong magnetic fields. Accurate control of the position and shape of the plasma boundary requires real-time feedback control of the magnetic field structure on a timescale of a few tens of microseconds. Software simulations have demonstrated that a neural network approach can give significantly better performance than the linear technique currently used on most tokamak experiments. The practical application of the neural network approach requires high-speed hardware, for which a fully parallel implementation of the multilayer perceptron, using a hybrid of digital and analogue technology, has been developed.
Transformation Invariant Autoassociation with Application to Handwritten Character Recognition
Schwenk, Holger, Milgram, Maurice
When training neural networks by the classical backpropagation algorithm, the whole problem to learn must be expressed by a set of inputs and desired outputs. However, we often have high-level knowledge about the learning problem. In optical character recognition (OCR), for instance, we know that the classification should be invariant under a set of transformations like rotation or translation. We propose a new modular classification system based on several autoassociative multilayer perceptrons which allows the efficient incorporation of such knowledge. Results are reported on the NIST database of upper case handwritten letters and compared to other approaches to the invariance problem. 1 INCORPORATION OF EXPLICIT KNOWLEDGE The aim of supervised learning is to learn a mapping between the input and the output space from a set of example pairs (input, desired output). The classical implementation in the domain of neural networks is the backpropagation algorithm. If this learning set is sufficiently representative of the underlying data distributions, one hopes that after learning, the system is able to generalize correctly to other inputs of the same distribution.
PCA-Pyramids for Image Compression
First, we show that we can use neural networks in a pyramidal framework, yielding the so-called PCA pyramids. Then we present an image compression method based on the PCA pyramid, which is similar to the Laplace pyramid and wavelet transform. Some experimental results with real images are reported. Finally, we present a method to combine the quantization step with the learning of the PCA pyramid. 1 Introduction In the past few years, a lot of work has been done on using neural networks for image compression.
JPMAX: Learning to Recognize Moving Objects as a Model-fitting Problem
Suzanna Becker Department of Psychology, McMaster University Hamilton, Ont. L8S 4K1 Abstract Unsupervised learning procedures have been successful at low-level feature extraction and preprocessing of raw sensor data. So far, however, they have had limited success in learning higher-order representations, e.g., of objects in visual images. A promising approach is to maximize some measure of agreement between the outputs of two groups of units which receive inputs physically separated in space, time or modality, as in (Becker and Hinton, 1992; Becker, 1993; de Sa, 1993). Using the same approach, a much simpler learning procedure is proposed here which discovers features in a single-layer network consisting of several populations of units, and can be applied to multi-layer networks trained one layer at a time. When trained with this algorithm on image sequences of moving geometric objects, a two-layer network can learn to perform accurate position-invariant object classification. 1 LEARNING COHERENT CLASSIFICATIONS A powerful constraint in sensory data is coherence over time, in space, and across different sensory modalities.
Learning direction in global motion: two classes of psychophysically-motivated models
Sundareswaran, V., Vaina, Lucia M.
Perceptual learning is defined as fast improvement in performance and retention of the learned ability over a period of time. In a set of psychophysical experiments we demonstrated that perceptual learning occurs for the discrimination of direction in stochastic motion stimuli. Here we model this learning using two approaches: a clustering model that learns to accommodate the motion noise, and an averaging model that learns to ignore the noise. Simulations of the models show performance similar to the psychophysical results. 1 Introduction Global motion perception is critical to many visual tasks: to perceive self-motion, to identify objects in motion, to determine the structure of the environment, and to make judgements for safe navigation. In the presence of noise, as in random dot kinematograms, efficient extraction of global motion involves considerable spatial integration. Newsome and colleagues (1989) showed that neurons in the macaque middle temporal area (MT) are motion direction-selective, and perform global integration of motion in their large receptive fields. Psychophysical studies in humans have characterized the limits of spatial and temporal integration in motion (Watamaniuk et al., 1984) and the nature of the underlying motion computations (Vaina et.
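The abstract does not spell out either model, so the following is only a generic illustration of how spatial integration can recover a global direction from noisy local motion: a vector (circular) average of individual dot directions. The function name and the degree-based interface are assumptions made for this sketch, and it is not the authors' averaging model.

```python
import math


def mean_direction(angles_deg):
    """Vector-average a list of directions (in degrees).

    Each direction is treated as a unit vector; the result is the angle of
    the summed vector, mapped into the range [0, 360).
    """
    x = sum(math.cos(math.radians(a)) for a in angles_deg)
    y = sum(math.sin(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(y, x)) % 360.0
```

For dots moving at 80 and 100 degrees, `mean_direction([80, 100])` returns a value of approximately 90. Averaging unit vectors rather than raw angles keeps the estimate sensible when directions straddle the 0/360 wrap-around.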
Learning Saccadic Eye Movements Using Multiscale Spatial Filters
Rao, Rajesh P. N., Ballard, Dana H.
Such sensors realize the simultaneous need for wide field-of-view and good visual acuity. One popular class of space-variant sensors is formed by log-polar sensors which have a small area near the optical axis of greatly increased resolution (the fovea) coupled with a peripheral region that witnesses a gradual logarithmic falloff in resolution as one moves radially outward. These sensors are inspired by similar structures found in the primate retina where one finds both a peripheral region of gradually decreasing acuity and a circularly symmetric area centralis characterized by a greater density of receptors and a disproportionate representation in the optic nerve [3]. The peripheral region, though of low visual acuity, is more sensitive to light intensity and movement. The existence of a region optimized for discrimination and recognition surrounded by a region geared towards detection thus allows the image of an object of interest detected in the outer region to be placed on the more analytic center for closer scrutiny. Such a strategy, however, necessitates the existence of (a) methods to determine which location in the periphery to foveate next, and (b) fast gaze-shifting mechanisms to achieve this.
Using Voice Transformations to Create Additional Training Talkers for Word Spotting
Chang, Eric I., Lippmann, Richard P.
Lack of training data has always been a constraint in training speech recognizers. This research presents a voice transformation technique which increases the variety among training talkers. The resulting more varied training set provided up to 2.9 percentage points of improvement in the figure of merit (average detection rate) of a high performance word spotter. This improvement is similar to the increase in performance provided by doubling the amount of training data (Carlson, 1994). This technique can also be applied to other speech recognition systems such as continuous speech recognition, talker identification, and isolated speech recognition.
Connectionist Speaker Normalization with Generalized Resource Allocating Networks
Furlanello, Cesare, Giuliani, Diego, Trentin, Edmondo
The paper presents a rapid speaker-normalization technique based on neural network spectral mapping. The neural network is used as a front-end of a continuous speech recognition system (speaker-dependent, HMM-based) to normalize the input acoustic data from a new speaker. The spectral difference between speakers can be reduced using a limited amount of new acoustic data (40 phonetically rich sentences). Recognition error of phone units from the acoustic-phonetic continuous speech corpus APASCI is decreased with an adaptability ratio of 25%. We used local basis networks of elliptical Gaussian kernels, with recursive allocation of units and online optimization of parameters (GRAN model). For this application, the model included a linear term. The results compare favorably with multivariate linear mapping based on constrained orthonormal transformations.