Goto

Collaborating Authors

 Asia


Bias, Variance and the Combination of Least Squares Estimators

Neural Information Processing Systems

We consider the effect of combining several least squares estimators on the expected performance of a regression problem. Computing the exact bias and variance curves as a function of the sample size we are able to quantitatively compare the effect of the combination on the bias and variance separately, and thus on the expected error which is the sum of the two. Our exact calculations, demonstrate that the combination of estimators is particularly useful in the case where the data set is small and noisy and the function to be learned is unrealizable. For large data sets the single estimator produces superior results. Finally, we show that by splitting the data set into several independent parts and training each estimator on a different subset, the performance can in some cases be significantly improved.


Computational Structure of coordinate transformations: A generalization study

Neural Information Processing Systems

One of the fundamental properties that both neural networks and the central nervous system share is the ability to learn and generalize from examples. While this property has been studied extensively in the neural network literature it has not been thoroughly explored in human perceptual and motor learning. We have chosen a coordinate transformation system-the visuomotor map which transforms visual coordinates into motor coordinates-to study the generalization effects of learning new input-output pairs. Using a paradigm of computer controlled altered visual feedback, we have studied the generalization of the visuomotor map subsequent to both local and context-dependent remappings. A local remapping of one or two input-output pairs induced a significant global, yet decaying, change in the visuomotor map, suggesting a representation for the map composed of units with large functional receptive fields. Our study of context-dependent remappings indicated that a single point in visual space can be mapped to two different finger locations depending on a context variable-the starting point of the movement. Furthermore, as the context is varied there is a gradual shift between the two remappings, consistent with two visuomotor modules being learned and gated smoothly with the context. 1 Introduction The human central nervous system (CNS) receives sensory inputs from a multitude of modalities, each tuned to extract different forms of information from the 1126 Zoubin Ghahramani, Daniel M. Wolpert, Michael 1. Jordan


Interference in Learning Internal Models of Inverse Dynamics in Humans

Neural Information Processing Systems

Experiments were performed to reveal some of the computational properties of the human motor memory system. We show that as humans practice reaching movements while interacting with a novel mechanical environment, they learn an internal model of the inverse dynamics of that environment. The representation of the internal model in memory is such that there is interference when there is an attempt to learn a new inverse dynamics map immediately after an anticorrelated mapping was learned. We suggest that this interference is an indication that the same computational elements used to encode the first inverse dynamics map are being used to learn the second mapping. We predict that this leads to a forgetting of the initially learned skill. 1 Introduction In tasks where we use our hands to interact with a tool, our motor system develops a model of the dynamics of that tool and uses this model to control the coupled dynamics of our arm and the tool (Shadmehr and Mussa-Ivaldi 1994). In physical systems theory, the tool is a mechanical analogue of an admittance, mapping a force as input onto a change in state as output (Hogan 1985).


Adaptive Elastic Input Field for Recognition Improvement

Neural Information Processing Systems

For machines to perform classification tasks, such as speech and character recognition, appropriately handling deformed patterns is a key to achieving high performance. The authors presents a new type of classification system, an Adaptive Input Field Neural Network (AIFNN), which includes a simple pre-trained neural network and an elastic input field attached to an input layer. By using an iterative method, AIFNN can determine an optimal affine translation for an elastic input field to compensate for the original deformations. The convergence of the AIFNN algorithm is shown. AIFNN is applied for handwritten numerals recognition. Consequently, 10.83% of originally misclassified patterns are correctly categorized and total performance is improved, without modifying the neural network.


The Use of Dynamic Writing Information in a Connectionist On-Line Cursive Handwriting Recognition System

Neural Information Processing Systems

This system combines a robust input representation, which preserves the dynamic writing information, with a neural network architecture, a so called Multi-State Time Delay Neural Network (MS-TDNN), which integrates rec.ognition and segmentation in a single framework. Our preprocessing transforms the original coordinate sequence into a (still temporal) sequence offeature vectors, which combine strictly local features, like curvature or writing direction, with a bitmap-like representation of the coordinate's proximity. The MS-TDNN architecture is well suited for handling temporal sequences as provided by this input representation. Our system is tested both on writer dependent and writer independent tasks with vocabulary sizes ranging from 400 up to 20,000 words. For example, on a 20,000 word vocabulary we achieve word recognition rates up to 88.9% (writer dependent) and 84.1 % (writer independent) without using any language models.


A Mixture Model System for Medical and Machine Diagnosis

Neural Information Processing Systems

Diagnosis of human disease or machine fault is a missing data problem since many variables are initially unknown. Additional information needs to be obtained. The j oint probability distribution of the data can be used to solve this problem. We model this with mixture models whose parameters are estimated by the EM algorithm. This gives the benefit that missing data in the database itself can also be handled correctly. The request for new information to refine the diagnosis is performed using the maximum utility principle. Since the system is based on learning it is domain independent and less labor intensive than expert systems or probabilistic networks. An example using a heart disease database is presented.


Nonlinear Image Interpolation using Manifold Learning

Neural Information Processing Systems

The problem of interpolating between specified images in an image sequence is a simple, but important task in model-based vision. We describe an approach based on the abstract task of "manifold learning" and present results on both synthetic and real image sequences. This problem arose in the development of a combined lipreading and speech recognition system.


Unsupervised Classification of 3D Objects from 2D Views

Neural Information Processing Systems

The human visual system can recognize various 3D (three-dimensional) objects from their 2D (two-dimensional) retinal images although the images vary significantly as the viewpoint changes. Recent computational models have explored how to learn to recognize 3D objects from their projected views (Poggio & Edelman, 1990). Most existing models are, however, based on supervised learning, i.e., during training the teacher tells which object each view belongs to. The model proposed by Weinshall et al. (1990) also requires a signal that segregates different objects during training. This paper, on the other hand, discusses unsupervised aspects of 3D object recognition where the system discovers categories by itself.


Associative Decorrelation Dynamics: A Theory of Self-Organization and Optimization in Feedback Networks

Neural Information Processing Systems

This paper outlines a dynamic theory of development and adaptation in neural networks with feedback connections. Given input ensemble, the connections change in strength according to an associative learning rule and approach a stable state where the neuronal outputs are decorrelated. We apply this theory to primary visual cortex and examine the implications of the dynamical decorrelation of the activities of orientation selective cells by the intracortical connections. The theory gives a unified and quantitative explanation of the psychophysical experiments on orientation contrast and orientation adaptation. Using only one parameter, we achieve good agreements between the theoretical predictions and the experimental data.


Hierarchical Mixtures of Experts Methodology Applied to Continuous Speech Recognition

Neural Information Processing Systems

In this paper, we incorporate the Hierarchical Mixtures of Experts (HME) method of probability estimation, developed by Jordan [1], into an HMMbased continuous speech recognition system. The resulting system can be thought of as a continuous-density HMM system, but instead of using gaussian mixtures, the HME system employs a large set of hierarchically organized but relatively small neural networks to perform the probability density estimation. The hierarchical structure is reminiscent of a decision tree except for two important differences: each "expert" or neural net performs a "soft" decision rather than a hard decision, and, unlike ordinary decision trees, the parameters of all the neural nets in the HME are automatically trainable using the EM algorithm. We report results on the ARPA 5,OOO-word and 4O,OOO-word Wall Street Journal corpus using HME models. 1 Introduction Recent research has shown that a continuous-density HMM (CD-HMM) system can outperform a more constrained tied-mixture HMM system for large-vocabulary continuous speech recognition (CSR) when a large amount of training data is available [2]. In other work, the utility of decision trees has been demonstrated in classification problems by using the "divide and conquer" paradigm effectively, where a problem is divided into a hierarchical set of simpler problems.