Glove-TalkII: Mapping Hand Gestures to Speech Using Neural Networks

Neural Information Processing Systems

Glove-TalkII is a system that translates hand gestures to speech through an adaptive interface. Hand gestures are mapped continuously to 10 control parameters of a parallel formant speech synthesizer. The mapping allows the hand to act as an artificial vocal tract that produces speech in real time. This gives an unlimited vocabulary in addition to direct control of fundamental frequency and volume. Currently, the best version of Glove-TalkII uses several input devices (including a CyberGlove, a ContactGlove, a 3-space tracker, and a foot pedal), a parallel formant speech synthesizer, and 3 neural networks.
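As an illustration of the kind of continuous mapping the abstract describes, the sketch below runs a single feedforward network from a glove-feature vector to 10 synthesizer control parameters. The input size, hidden size, weights, and the `glove_frame` values are all hypothetical placeholders, not the paper's trained networks.

```python
import math
import random

def mlp_forward(x, W1, b1, W2, b2):
    """One-hidden-layer MLP: y = W2 * tanh(W1 x + b1) + b2."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    return [sum(w * hi for w, hi in zip(row, h)) + b
            for row, b in zip(W2, b2)]

random.seed(0)
n_in, n_hidden, n_out = 4, 8, 10   # 10 formant-control outputs, as in the paper
W1 = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [0.0] * n_hidden
W2 = [[random.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
b2 = [0.0] * n_out

glove_frame = [0.2, -0.5, 0.8, 0.1]   # hypothetical normalized glove readings
params = mlp_forward(glove_frame, W1, b1, W2, b2)
```

Each new glove frame would be pushed through this map to drive the synthesizer, which is what makes real-time, unlimited-vocabulary output possible.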


Learning Saccadic Eye Movements Using Multiscale Spatial Filters

Neural Information Processing Systems

Space-variant sensors satisfy the simultaneous need for a wide field of view and good visual acuity. One popular class of space-variant sensors is formed by log-polar sensors, which have a small area of greatly increased resolution near the optical axis (the fovea) coupled with a peripheral region whose resolution falls off logarithmically as one moves radially outward. These sensors are inspired by similar structures in the primate retina, where one finds both a peripheral region of gradually decreasing acuity and a circularly symmetric area centralis characterized by a greater density of receptors and a disproportionate representation in the optic nerve [3]. The peripheral region, though of low visual acuity, is more sensitive to light intensity and movement. The existence of a region optimized for discrimination and recognition, surrounded by a region geared towards detection, thus allows the image of an object of interest detected in the outer region to be placed on the more analytic center for closer scrutiny. Such a strategy, however, necessitates (a) methods to determine which location in the periphery to foveate next, and (b) fast gaze-shifting mechanisms to achieve this. (Rajesh P. N. Rao, Dana H. Ballard)
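The logarithmic falloff can be made concrete with a minimal sketch of the log-polar coordinate transform: equal steps in log-radius correspond to exponentially widening rings, so a fixed number of sensor rows covers an ever-larger patch of the periphery.

```python
import math

def log_polar(x, y, r0=1.0):
    """Map a Cartesian point to (log-radius, angle) coordinates."""
    r = math.hypot(x, y)
    return math.log(r / r0), math.atan2(y, x)

# Equal steps in log-radius correspond to exponentially growing rings,
# so peripheral resolution falls off logarithmically with eccentricity.
radii = [math.exp(u) for u in range(4)]        # rings at r = 1, e, e^2, e^3
widths = [b - a for a, b in zip(radii, radii[1:])]
```

The growing `widths` values are the sensor-design counterpart of the retina's decreasing peripheral acuity.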


The Use of Dynamic Writing Information in a Connectionist On-Line Cursive Handwriting Recognition System

Neural Information Processing Systems

This system combines a robust input representation, which preserves the dynamic writing information, with a neural network architecture, a so-called Multi-State Time Delay Neural Network (MS-TDNN), which integrates recognition and segmentation in a single framework. Our preprocessing transforms the original coordinate sequence into a (still temporal) sequence of feature vectors, which combine strictly local features, like curvature or writing direction, with a bitmap-like representation of the coordinate's proximity. The MS-TDNN architecture is well suited for handling temporal sequences as provided by this input representation. Our system is tested on both writer-dependent and writer-independent tasks with vocabulary sizes ranging from 400 up to 20,000 words. For example, on a 20,000-word vocabulary we achieve word recognition rates of up to 88.9% (writer-dependent) and 84.1% (writer-independent) without using any language models.
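The strictly local features mentioned above (writing direction and curvature) can be sketched directly from the pen-coordinate sequence; the function below is an illustrative reconstruction, not the paper's exact preprocessing pipeline.

```python
import math

def stroke_features(points):
    """Per-point writing direction and curvature from an (x, y) pen trajectory.

    Direction is the angle of the segment entering each point; curvature is
    approximated by the turn angle between consecutive segments (radians).
    """
    feats = []
    for i in range(1, len(points) - 1):
        (x0, y0), (x1, y1), (x2, y2) = points[i - 1], points[i], points[i + 1]
        a_in = math.atan2(y1 - y0, x1 - x0)
        a_out = math.atan2(y2 - y1, x2 - x1)
        # Wrap the angle difference into (-pi, pi].
        turn = math.atan2(math.sin(a_out - a_in), math.cos(a_out - a_in))
        feats.append((a_in, turn))
    return feats

# A straight horizontal stroke: direction 0 and curvature 0 everywhere.
straight = [(float(i), 0.0) for i in range(5)]
feats = stroke_features(straight)
```

Feature vectors like these, concatenated with the bitmap-style context, form the temporal sequence the MS-TDNN consumes.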


Pulsestream Synapses with Non-Volatile Analogue Amorphous-Silicon Memories

Neural Information Processing Systems

This paper presents results from the first use of neural networks for the real-time feedback control of high temperature plasmas in a tokamak fusion experiment. The tokamak is currently the principal experimental device for research into the magnetic confinement approach to controlled fusion. In the tokamak, hydrogen plasmas, at temperatures of up to 100 million K, are confined by strong magnetic fields. Accurate control of the position and shape of the plasma boundary requires real-time feedback control of the magnetic field structure on a timescale of a few tens of microseconds. Software simulations have demonstrated that a neural network approach can give significantly better performance than the linear technique currently used on most tokamak experiments. The practical application of the neural network approach requires high-speed hardware, for which a fully parallel implementation of the multilayer perceptron, using a hybrid of digital and analogue technology, has been developed.
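To make the role of fast feedback concrete, here is a generic discrete-time illustration (not a plasma model, and not the paper's controller): an unstable scalar plant drifts away on its own, but proportional feedback applied every control cycle drives the error to zero. All gains are hypothetical.

```python
# Toy discrete-time feedback loop: an unstable scalar plant z' = a*z + b*u
# with a > 1 grows without control; proportional feedback u = -k*z
# stabilizes it whenever |a - b*k| < 1.
a, b, k = 1.5, 1.0, 1.2           # hypothetical plant and controller gains
z = 1.0                           # initial position error
history = [z]
for _ in range(40):
    u = -k * z                    # feedback computed each control cycle
    z = a * z + b * u             # closed-loop update: z' = (a - b*k) * z
    history.append(z)
```

The need to recompute `u` every few tens of microseconds is exactly what motivates the hardware MLP implementation described in the abstract.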


Phase-Space Learning

Neural Information Processing Systems

Existing recurrent net learning algorithms are inadequate. We introduce the conceptual framework of viewing recurrent training as matching vector fields of dynamical systems in phase space. Phase-space reconstruction techniques make the hidden states explicit, reducing temporal learning to a feed-forward problem. In short, we propose viewing iterated prediction [LF88] as the best way of training recurrent networks on deterministic signals. Using this framework, we can train multiple trajectories, ensure their stability, and design arbitrary dynamical systems.

1 INTRODUCTION

Existing general-purpose recurrent algorithms are capable of rich dynamical behavior. Unfortunately, straightforward applications of these algorithms to training fully recurrent networks on complex temporal tasks have had much less success than their feedforward counterparts. For example, to train a recurrent network to oscillate like a sine wave (the "hydrogen atom" of recurrent learning), existing techniques such as Real-Time Recurrent Learning (RTRL) [WZ89] perform suboptimally. Williams & Zipser trained a two-unit network with RTRL, with one teacher signal. One unit of the resulting network showed a distorted waveform, the other only half the desired amplitude.
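The sine-wave example above illustrates the framework directly: in delay (phase-space) coordinates, the next sample of a sine is an ordinary feedforward function of the two previous samples, so the oscillation can be generated by iterated prediction. The sketch below uses the exact analytic map in place of a trained network.

```python
import math

# Delay-embed a sine wave: in phase-space coordinates (x[t-1], x[t-2]) the
# next sample is a *feedforward* function, x[t] = 2*cos(w)*x[t-1] - x[t-2].
w = 0.3                              # angular frequency per step
predict = lambda x1, x2: 2.0 * math.cos(w) * x1 - x2

# Iterated prediction: feed the predictor's own outputs back as inputs.
x = [math.sin(0.0), math.sin(w)]     # two seed samples
for t in range(2, 200):
    x.append(predict(x[-1], x[-2]))
```

A network trained to approximate `predict` on reconstructed phase-space pairs would be trained purely feedforward, then run in this closed loop, which is the essence of the proposal.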


A Critical Comparison of Models for Orientation and Ocular Dominance Columns in the Striate Cortex

Neural Information Processing Systems

More than ten of the most prominent models for the structure and for the activity-dependent formation of orientation and ocular dominance columns in the striate cortex have been evaluated. We implemented those models on parallel machines, extensively explored parameter space, and quantitatively compared model predictions with experimental data recorded optically from macaque striate cortex. In our contribution we present a summary of our results to date. Briefly, we find that (i) despite apparent differences, many models are based on similar principles and, consequently, make similar predictions; (ii) certain "pattern models" as well as the developmental "correlation-based learning" models disagree with the experimental data; and (iii) of the models we have investigated, "competitive Hebbian" models and the recent model of Swindale provide the best match with experimental data.

1 Models and Data

The models for the formation and structure of orientation and ocular dominance columns which we have investigated are summarized in table 1. Models fall into two categories: "pattern models," whose aim is to achieve a concise description of the observed patterns, and "developmental models," which are focused on the pro-


A Novel Reinforcement Model of Birdsong Vocalization Learning

Neural Information Processing Systems

Songbirds learn to imitate a tutor song through auditory and motor learning. We have developed a theoretical framework for song learning that accounts for response properties of neurons that have been observed in many of the nuclei that are involved in song learning. Specifically, we suggest that the anterior forebrain pathway, which is not needed for song production in the adult but is essential for song acquisition, provides synaptic perturbations and adaptive evaluations for syllable vocalization learning. A computer model based on reinforcement learning was constructed that could replicate a real zebra finch song with 90% accuracy based on a spectrographic measure. The second generation of the birdsong model replicated the tutor song with 96% accuracy.
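The perturb-and-evaluate scheme can be sketched as simple stochastic hill climbing: random perturbations of motor parameters are kept only when an evaluation against a tutor template improves. The template, noise scale, and trial count below are illustrative, not the paper's model.

```python
import random

random.seed(1)
target = [0.8, 0.2, 0.5, 0.9]          # stand-in for a tutor-syllable template

def evaluation(params):
    """Adaptive critic: negative squared distance to the tutor template."""
    return -sum((p - t) ** 2 for p, t in zip(params, target))

params = [0.0, 0.0, 0.0, 0.0]          # motor parameters for one syllable
score = evaluation(params)
for trial in range(2000):
    trial_params = [p + random.gauss(0.0, 0.05) for p in params]
    trial_score = evaluation(trial_params)
    if trial_score > score:            # keep perturbations the critic rewards
        params, score = trial_params, trial_score
```

In the abstract's terms, the perturbations play the role of the anterior forebrain pathway's synaptic variability and `evaluation` plays the role of its adaptive evaluation signal.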


New Algorithms for 2D and 3D Point Matching: Pose Estimation and Correspondence

Neural Information Processing Systems

A fundamental open problem in computer vision, determining pose and correspondence between two sets of points in space, is solved with a novel, robust, and easily implementable algorithm. The technique works on noisy point sets that may be of unequal sizes and may differ by nonrigid transformations. A 2D variation calculates the pose between point sets related by an affine transformation (translation, rotation, scale, and shear). A 3D to 3D variation calculates translation and rotation. An objective describing the problem is derived from mean-field theory. The objective is minimized with clocked (EM-like) dynamics. Experiments with both handwritten and synthetic data provide empirical evidence for the method.
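The clocked alternation between the pose and correspondence variables can be illustrated with an ICP-style sketch: hold the pose fixed to update a (here, hard nearest-neighbor) correspondence, then hold the match fixed to solve the 2D rigid pose in closed form. This is not the paper's mean-field objective, only the alternating structure.

```python
import math

def fit_rigid_2d(src, dst):
    """Closed-form 2D rotation + translation minimizing sum ||R p + t - q||^2."""
    n = len(src)
    cx, cy = sum(p[0] for p in src) / n, sum(p[1] for p in src) / n
    dx, dy = sum(q[0] for q in dst) / n, sum(q[1] for q in dst) / n
    s_cos = sum((q[0] - dx) * (p[0] - cx) + (q[1] - dy) * (p[1] - cy)
                for p, q in zip(src, dst))
    s_sin = sum((q[1] - dy) * (p[0] - cx) - (q[0] - dx) * (p[1] - cy)
                for p, q in zip(src, dst))
    theta = math.atan2(s_sin, s_cos)
    c, s = math.cos(theta), math.sin(theta)
    return theta, (dx - (c * cx - s * cy), dy - (s * cx + c * cy))

def transform(p, theta, t):
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1] + t[0], s * p[0] + c * p[1] + t[1])

src = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]
true_theta, true_t = 0.17, (0.3, -0.2)
dst = [transform(p, true_theta, true_t) for p in src]

# Clocked alternation: fix pose -> update correspondence; fix match -> update pose.
theta, t = 0.0, (0.0, 0.0)
for _ in range(5):
    moved = [transform(p, theta, t) for p in src]
    match = [min(dst, key=lambda q, m=m: (q[0] - m[0]) ** 2 + (q[1] - m[1]) ** 2)
             for m in moved]
    theta, t = fit_rigid_2d(src, match)
```

The mean-field version replaces the hard nearest-neighbor step with a soft correspondence matrix, which is what gives the published method its robustness to outliers and unequal set sizes.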


Learning with Product Units

Neural Information Processing Systems

The TNM staging system has been used since the early 1960s to predict breast cancer patient outcome. In an attempt to increase prognostic accuracy, many putative prognostic factors have been identified. Because the TNM stage model cannot accommodate these new factors, the proliferation of factors in breast cancer has led to clinical confusion. What is required is a new computerized prognostic system that can test putative prognostic factors and integrate the predictive factors with the TNM variables in order to increase prognostic accuracy. Using the area under the curve of the receiver operating characteristic, we compare the accuracy of the following predictive models in terms of five-year breast cancer-specific survival: the pTNM staging system, principal component analysis, classification and regression trees, logistic regression, a cascade correlation neural network, a conjugate gradient descent neural network, a probabilistic neural network, and a backpropagation neural network. Several statistical models are significantly more accurate. (Harry B. Burke, David B. Rosen, Philip H. Goodman)
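The comparison metric used above, area under the ROC curve, has a simple rank-based definition (the Mann-Whitney statistic), sketched here for clarity:

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparisons (Mann-Whitney U).

    Equals the probability that a randomly chosen positive case scores higher
    than a randomly chosen negative case; ties count one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 is chance-level ranking and 1.0 is perfect separation, which is what makes it a natural common scale for comparing staging systems, regression models, and neural networks.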


A Model of the Neural Basis of the Rat's Sense of Direction

Neural Information Processing Systems

In the last decade the outlines of the neural structures subserving the sense of direction have begun to emerge. Several investigations have shed light on the effects of vestibular input and visual input on the head direction representation. In this paper, a model is formulated of the neural mechanisms underlying the head direction system. The model is built out of simple ingredients, depending on nothing more complicated than connectional specificity, attractor dynamics, Hebbian learning, and sigmoidal nonlinearities, but it behaves in a sophisticated way and is consistent with most of the observed properties of real head direction cells. In addition, it makes a number of predictions that ought to be testable by reasonably straightforward experiments.
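The attractor-dynamics ingredient can be sketched as a ring network: local excitation plus uniform inhibition sustains a self-maintained bump of activity at one preferred direction. The cell count, weight profile, and clipped-linear activation below are illustrative choices, not the paper's parameters.

```python
import math

N = 16        # head direction cells arranged on a ring of preferred directions
peak = 5      # index of the initially represented direction

def weight(i, j):
    """Local excitation minus uniform inhibition on the ring."""
    d = min(abs(i - j), N - abs(i - j))        # circular distance
    return math.exp(-d * d / 8.0) - 0.3

def step(r):
    """One rate update with a piecewise-linear activation clipped to [0, 1]."""
    return [max(0.0, min(1.0, sum(weight(i, j) * r[j] for j in range(N))))
            for i in range(N)]

# Start with a bump of activity centered on one preferred direction; the
# recurrent dynamics should sustain it with no further input.
r = [math.exp(-min(abs(i - peak), N - abs(i - peak)) ** 2 / 8.0)
     for i in range(N)]
for _ in range(30):
    r = step(r)
```

Vestibular input would then shift the bump around the ring, which is how such models track head direction in the dark.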