A Silicon Axon

Neural Information Processing Systems

It is well known that axons are neural processes specialized for transmitting information over relatively long distances in the nervous system. Impulsive electrical disturbances known as action potentials are normally initiated near the cell body of a neuron when the voltage across the cell membrane crosses a threshold. These pulses are then propagated with a fairly stereotypical shape at a more or less constant velocity down the length of the axon. Consequently, axons excel at precisely preserving the relative timing of threshold crossing events but do not preserve any of the initial signal shape. Information, then, is presumably encoded in the relative timing of action potentials.
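
To make the timing code concrete, the toy sketch below (not from the paper; the synthetic voltage trace and the threshold value are made-up assumptions) keeps only the times of upward threshold crossings and discards the waveform shape, which is essentially the information an axon is said to preserve.

```python
import numpy as np

def crossing_times(v, threshold, dt):
    """Return the times at which the trace v crosses the threshold from below."""
    above = v >= threshold
    # A crossing event occurs where the trace steps from below to above threshold.
    onsets = np.where(~above[:-1] & above[1:])[0] + 1
    return onsets * dt

# Synthetic membrane-voltage trace (mV), sampled every 0.1 ms; values are made up.
dt = 0.1
t = np.arange(0.0, 50.0, dt)
v = -70.0 + 15.0 * np.sin(2.0 * np.pi * t / 10.0) + np.random.randn(t.size)
print(crossing_times(v, threshold=-60.0, dt=dt))
```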


Learning to Play the Game of Chess

Neural Information Processing Systems

This paper presents NeuroChess, a program which learns to play chess from the final outcome of games. NeuroChess learns chess board evaluation functions, represented by artificial neural networks.
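
NeuroChess's actual architecture and training procedure are described in the paper; the fragment below is only a generic sketch of the underlying idea of fitting a small evaluation network toward final-outcome targets. The feature encoding, network size, and placeholder data are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy encoding: each position is a 64-dimensional feature vector, and the
# training target is the final outcome of its game (+1 win, -1 loss, 0 draw).
n_features, n_hidden, lr = 64, 16, 0.01
W1 = rng.normal(scale=0.1, size=(n_hidden, n_features))
w2 = rng.normal(scale=0.1, size=n_hidden)

def evaluate(x):
    h = np.tanh(W1 @ x)          # hidden layer
    return np.tanh(w2 @ h), h    # scalar board evaluation in (-1, 1)

for _ in range(1000):
    x = rng.normal(size=n_features)   # placeholder position features
    outcome = np.sign(x.sum())        # placeholder outcome label
    y, h = evaluate(x)
    # Squared-error gradient step toward the final-outcome target.
    dz = (y - outcome) * (1.0 - y**2)
    grad_W1 = np.outer(dz * w2 * (1.0 - h**2), x)
    w2 -= lr * dz * h
    W1 -= lr * grad_W1
```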


Interior Point Implementations of Alternating Minimization Training

Neural Information Processing Systems

Alternating minimization (AM) techniques were first introduced in soft-competitive learning algorithms [1]. This training procedure was later shown to be closely related to Expectation-Maximization algorithms used by the statistical estimation community [2].
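
As a hedged sketch of what alternating minimization looks like in a soft-competitive setting (an illustrative analogue only, not the interior point implementation the paper develops), the objective is minimized over two blocks of variables in turn, each step holding the other block fixed, much like the E and M steps of EM.

```python
import numpy as np

def soft_competitive_am(x, k=3, beta=5.0, iters=50, seed=0):
    """Alternate between the two blocks of unknowns: soft assignments and centers."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # Block 1: with the centers fixed, minimize over the soft assignments.
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        r = np.exp(-beta * d2)
        r /= r.sum(axis=1, keepdims=True)
        # Block 2: with the assignments fixed, minimize over the centers.
        centers = (r.T @ x) / r.sum(axis=0)[:, None]
    return centers, r
```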


Bias, Variance and the Combination of Least Squares Estimators

Neural Information Processing Systems

We consider the effect of combining several least squares estimators on the expected performance of a regression problem. Computing the exact bias and variance curves as a function of the sample size, we are able to quantitatively compare the effect of the combination on the bias and variance separately, and thus on the expected error, which is the sum of the two. Our exact calculations demonstrate that the combination of estimators is particularly useful in the case where the data set is small and noisy and the function to be learned is unrealizable. For large data sets the single estimator produces superior results. Finally, we show that by splitting the data set into several independent parts and training each estimator on a different subset, the performance can in some cases be significantly improved.
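
A minimal sketch of the data-splitting idea, assuming a simple unweighted average of the individual estimators (the paper's results are analytic, not this simulation, and its exact combination rule may differ):

```python
import numpy as np

def combined_least_squares(X, y, n_parts=4, seed=0):
    """Split the data into disjoint parts, fit one least squares estimator per part,
    and combine them by averaging the coefficient vectors."""
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(len(y)), n_parts)
    coefs = [np.linalg.lstsq(X[p], y[p], rcond=None)[0] for p in parts]
    return np.mean(coefs, axis=0)   # predictions of the combined estimator: X @ w_bar
```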


Optimal Movement Primitives

Neural Information Processing Systems

The theory of Optimal Unsupervised Motor Learning shows how a network can discover a reduced-order controller for an unknown nonlinear system by representing only the most significant modes. Here, I extend the theory to apply to command sequences, so that the most significant components discovered by the network correspond to motion "primitives". Combinations of these primitives can be used to produce a wide variety of different movements. I demonstrate applications to human handwriting decomposition and synthesis, as well as to the analysis of electrophysiological experiments on movements resulting from stimulation of the frog spinal cord. 1 INTRODUCTION There is much debate within the neuroscience community concerning the internal representation of movement, and current neurophysiological investigations are aimed at uncovering these representations. In this paper, I propose a different approach that attempts to define the optimal internal representation in terms of "movement primitives", and I compare this representation with the observed behavior.
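
One simple way to make "primitives as most significant components" concrete is a principal component decomposition of recorded trajectories; the sketch below uses a plain SVD and is only an assumed illustration, not the Optimal Unsupervised Motor Learning network developed in the paper.

```python
import numpy as np

def movement_primitives(trajectories, n_primitives=3):
    """trajectories: array of shape (n_movements, n_timesteps).
    Returns the leading components ('primitives'), the per-movement weights,
    and the reconstruction obtained by recombining the primitives."""
    mean = trajectories.mean(axis=0)
    _, _, Vt = np.linalg.svd(trajectories - mean, full_matrices=False)
    primitives = Vt[:n_primitives]
    weights = (trajectories - mean) @ primitives.T
    reconstruction = mean + weights @ primitives
    return primitives, weights, reconstruction
```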


Asymptotics of Gradient-based Neural Network Training Algorithms

Neural Information Processing Systems

We study the asymptotic properties of the sequence of iterates of weight-vector estimates obtained by training a multilayer feedforward neural network with a basic gradient-descent method using a fixed learning constant and no batch-processing. In the one-dimensional case, an exact analysis establishes the existence of a limiting distribution that is not Gaussian in general. For the general case and small learning constant, a linearization approximation permits the application of results from the theory of random matrices to again establish the existence of a limiting distribution. We study the first few moments of this distribution to compare and contrast the results of our analysis with those of techniques of stochastic approximation. 1 INTRODUCTION The wide applicability of neural networks to problems in pattern classification and signal processing has been due to the development of efficient gradient-descent algorithms for the supervised training of multilayer feedforward neural networks with differentiable node functions. A basic version uses a fixed learning constant and updates all weights after each training input is presented (online mode) rather than after the entire training set has been presented (batch mode). The properties of this algorithm as exhibited by the sequence of iterates are not yet well-understood. There are at present two major approaches.
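
The training rule being analyzed is the plain online update with a fixed learning constant; below is a minimal sketch of the sequence of iterates it produces, where the loss and the data are placeholders chosen only to make the example run.

```python
import numpy as np

def online_gd(grad, w0, samples, eta=0.01):
    """Update the weights after every single training input (online mode),
    using a fixed learning constant eta and no batching or decay."""
    w = np.asarray(w0, dtype=float)
    iterates = [w.copy()]
    for x in samples:
        w = w - eta * grad(w, x)
        iterates.append(w.copy())
    return np.array(iterates)

# One-dimensional example: quadratic loss 0.5*(w - x)**2 with gradient (w - x).
rng = np.random.default_rng(0)
iters = online_gd(lambda w, x: w - x, w0=0.0, samples=rng.normal(1.0, 0.5, size=5000))
print(iters[-500:].mean(), iters[-500:].std())  # the iterates settle into a limiting spread
```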


Combining Estimators Using Non-Constant Weighting Functions

Neural Information Processing Systems

Volker Tresp* and Michiaki Taniguchi, Siemens AG, Central Research, Otto-Hahn-Ring 6, 81730 München, Germany (*at the time of the research for this paper, a visiting researcher at the Center for Biological and Computational Learning, MIT). This paper discusses the linearly weighted combination of estimators in which the weighting functions are dependent on the input. We show that the weighting functions can be derived either by evaluating the input-dependent variance of each estimator or by estimating how likely it is that a given estimator has seen data in the region of the input space close to the input pattern. The latter solution is closely related to the mixture of experts approach, and we show how learning rules for the mixture of experts can be derived from the theory about learning with missing features. The presented approaches are modular since the weighting functions can easily be modified (no retraining) if more estimators are added. Furthermore, it is easy to incorporate estimators which were not derived from data, such as expert systems or algorithms. 1 Introduction Instead of modeling the global dependency between input x ∈ D and output y using a single estimator, it is often very useful to decompose a complex mapping
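
A minimal sketch of the variance-based variant, assuming the per-estimator variance estimates at the query point are already available (how they are obtained, and the density-based alternative, are what the paper actually works out):

```python
import numpy as np

def combine_predictions(preds, variances):
    """preds, variances: per-estimator predictions and estimated variances at one input x.
    Each estimator is weighted inversely to its input-dependent variance."""
    w = 1.0 / np.asarray(variances, dtype=float)
    w /= w.sum()
    return float(w @ np.asarray(preds, dtype=float))

# Example: three estimators, the second being the most reliable at this particular x.
print(combine_predictions([0.9, 1.2, 0.4], [0.5, 0.05, 1.0]))
```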


Limits on Learning Machine Accuracy Imposed by Data Quality

Neural Information Processing Systems

Random errors and insufficiencies in databases limit the performance of any classifier trained from and applied to the database. In this paper we propose a method to estimate the limiting performance of classifiers imposed by the database. We demonstrate this technique on the task of predicting failure in telecommunication paths. 1 Introduction Data collection for a classification or regression task is prone to random errors, e.g.
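
One crude way to illustrate such a limit (an assumed toy bound, not the estimation method the paper proposes): when records with identical feature vectors carry conflicting labels, no classifier can beat the majority label within each group, which caps the achievable accuracy.

```python
import numpy as np

def accuracy_ceiling(features, labels):
    """features: iterable of feature vectors; labels: integer class labels.
    Returns an upper bound on the accuracy any classifier can reach on this data."""
    groups = {}
    for x, y in zip(map(tuple, features), labels):
        groups.setdefault(x, []).append(y)
    correct = sum(np.bincount(ys).max() for ys in groups.values())
    return correct / len(labels)

# Example: two identical records disagree on the label, so perfect accuracy is impossible.
print(accuracy_ceiling([[1, 0], [1, 0], [0, 1]], [0, 1, 1]))   # -> 2/3
```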