Biologically Plausible Local Learning Rules for the Adaptation of the Vestibulo-Ocular Reflex
Coenen, Olivier, Sejnowski, Terrence J., Lisberger, Stephen G.
The vestibulo-ocular reflex (VOR) is a compensatory eye movement that stabilizes images on the retina during head turns. Its magnitude, or gain, can be modified by visual experience during head movements. Possible learning mechanisms for this adaptation have been explored in a model of the oculomotor system based on anatomical and physiological constraints. The local correlational learning rules in our model reproduce the adaptation and behavior of the VOR under certain parameter conditions. From these conditions, predictions are made for the time course of adaptation at the learning sites.

1 INTRODUCTION

The primate oculomotor system is capable of maintaining the image of an object on the fovea even when the head and object are moving simultaneously.
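As a loose illustration of the kind of local correlational rule the abstract describes (not the paper's actual circuit model), the sketch below adapts a single VOR gain by correlating retinal slip with a head-velocity signal available at the learning site; the signals, learning rate, and slip-nulling target are all assumptions for the example.

```python
import numpy as np

# Minimal sketch: head velocity h drives an eye command e = -g * h; retinal
# slip s = h + e is the visual error. Correlating slip with the locally
# available head-velocity signal drives the gain g toward the value that
# nulls the slip. Values of eta and the initial gain are illustrative.

rng = np.random.default_rng(0)
g, eta = 0.5, 0.05                      # initial gain, learning rate

for step in range(500):
    h = rng.normal()                    # head velocity on this trial
    e = -g * h                          # compensatory eye velocity
    s = h + e                           # retinal slip (image motion)
    g += eta * s * h                    # local correlation of slip and head signal

print(f"adapted gain: {g:.3f}")         # approaches 1.0 (slip nulled)
```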
A Neural Network that Learns to Interpret Myocardial Planar Thallium Scintigrams
Rosenberg, Charles, Erel, Jacob, Atlan, Henri
The planar thallium-201 myocardial perfusion scintigram is a widely used diagnostic technique for detecting and estimating the risk of coronary artery disease. Neural networks learned to interpret 100 thallium scintigrams as determined by individual expert ratings. Standard error backpropagation was compared to standard LMS, and to LMS combined with one layer of RBF units.
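A minimal sketch of the LMS (Widrow-Hoff) baseline of the sort compared in the paper, on random stand-in data; the actual scintigram features and expert ratings are not reproduced.

```python
import numpy as np

# A single linear layer trained case by case with the delta rule; the
# data here are synthetic stand-ins for feature/rating pairs.

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 20))          # 100 cases, 20 input features
w_true = rng.normal(size=20)
y = X @ w_true + 0.1 * rng.normal(size=100)

w, eta = np.zeros(20), 0.01
for epoch in range(50):
    for x_i, y_i in zip(X, y):
        err = y_i - w @ x_i             # prediction error on one case
        w += eta * err * x_i            # LMS update

print(f"final MSE: {np.mean((X @ w - y) ** 2):.4f}")
```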
A Model of Feedback to the Lateral Geniculate Nucleus
Simplified models of the lateral geniculate nucleus (LGN) and striate cortex illustrate the possibility that feedback to the LGN may be used for robust, low-level pattern analysis. The information fed back to the LGN is rebroadcast to cortex using the LGN's full fan-out, so the cortex-LGN-cortex pathway mediates extensive cortico-cortical communication while keeping the number of necessary connections small.

1 INTRODUCTION

The lateral geniculate nucleus (LGN) in the thalamus is often considered as just a relay station on the way from the retina to visual cortex, since receptive field properties of neurons in the LGN are very similar to retinal ganglion cell receptive field properties. However, there is a massive projection from cortex back to the LGN: it is estimated that 3-4 times more synapses in the LGN are due to corticogeniculate connections than to retinogeniculate connections [12]. This suggests an important processing role for the LGN, but the nature of the computation performed has remained far from clear. I will first briefly summarize some anatomical facts and physiological results concerning the corticogeniculate loop, and then present a simplified model in which its function is to (usefully) mediate communication between cortical cells.
Word Space
Representations for semantic information about words are necessary for many applications of neural networks in natural language processing. This paper describes an efficient, corpus-based method for inducing distributed semantic representations for a large number of words (50,000) from lexical co-occurrence statistics by means of a large-scale linear regression. The representations are successfully applied to word sense disambiguation using a nearest neighbor method.

1 Introduction

Many tasks in natural language processing require access to semantic information about lexical items and text segments.
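An illustrative sketch of the underlying co-occurrence idea on a toy corpus; the paper's large-scale linear regression step and 50,000-word vocabulary are omitted, and the tiny corpus, window size, and cosine similarity are assumptions for the example.

```python
import numpy as np

# Each word is represented by counts of nearby words within a window;
# similarity is the cosine between count vectors, so words used in similar
# contexts end up near each other.

corpus = ("the bank raised interest rates "
          "the river bank flooded after rain "
          "interest rates fell at the bank").split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}

window = 2
counts = np.zeros((len(vocab), len(vocab)))
for i, w in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if j != i:
            counts[idx[w], idx[corpus[j]]] += 1

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12)

# nearest neighbors of "bank" in co-occurrence space
sims = sorted(vocab, key=lambda w: -cosine(counts[idx["bank"]], counts[idx[w]]))
print(sims[:4])
```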
Second order derivatives for network pruning: Optimal Brain Surgeon
Hassibi, Babak, Stork, David G.
We investigate the use of information from all second order derivatives of the error function to perform network pruning (i.e., removing unimportant weights from a trained network) in order to improve generalization, simplify networks, reduce hardware or storage requirements, increase the speed of further training, and in some cases enable rule extraction. Our method, Optimal Brain Surgeon (OBS), is significantly better than magnitude-based methods and Optimal Brain Damage [Le Cun, Denker and Solla, 1990], which often remove the wrong weights. OBS permits the pruning of more weights than other methods (for the same error on the training set), and thus yields better generalization on test data. Crucial to OBS is a recursion relation for calculating the inverse Hessian matrix H⁻¹ from training data and structural information of the net. OBS permits a 90%, a 76%, and a 62% reduction in weights over backpropagation with weight decay on three benchmark MONK's problems [Thrun et al., 1991]. Of OBS, Optimal Brain Damage, and magnitude-based methods, only OBS deletes the correct weights from a trained XOR network in every case. Finally, whereas Sejnowski and Rosenberg [1987] used 18,000 weights in their NETtalk network, we used OBS to prune a network to just 1560 weights, yielding better generalization.
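A sketch of a single OBS pruning step, assuming the inverse Hessian is already available (the paper's recursion for building it from training data is not reproduced); the saliency and weight-update formulas follow the standard OBS derivation.

```python
import numpy as np

# Saliency of weight q:   L_q = w_q**2 / (2 * H_inv[q, q])
# Full-weight update:     dw  = -(w_q / H_inv[q, q]) * H_inv[:, q]
# which zeroes w_q while adjusting all remaining weights to compensate.

def obs_prune_one(w, H_inv):
    saliency = w ** 2 / (2.0 * np.diag(H_inv))
    q = int(np.argmin(saliency))            # least-salient weight
    dw = -(w[q] / H_inv[q, q]) * H_inv[:, q]
    return w + dw, q, saliency[q]

# toy example with a random positive-definite Hessian
rng = np.random.default_rng(2)
A = rng.normal(size=(5, 5))
H = A @ A.T + 5 * np.eye(5)
w = rng.normal(size=5)
w_new, q, L = obs_prune_one(w, np.linalg.inv(H))
print(f"pruned weight {q} (saliency {L:.4f}); new w_q = {w_new[q]:.2e}")
```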
Interposing an ontogenetic model between Genetic Algorithms and Neural Networks
The relationships between learning, development and evolution in Nature are taken seriously, suggesting a model of the developmental process whereby the genotypes manipulated by the Genetic Algorithm (GA) might be expressed to form phenotypic neural networks (NNets) that then go on to learn. ONTOL is a grammar for generating polynomial NNets for time-series prediction. Genomes correspond to an ordered sequence of ONTOL productions and define a grammar that is expressed to generate a NNet. The NNet's weights are then modified by learning, and the individual's prediction error is used to determine GA fitness. A new gene-doubling operator appears critical to the formation of new genetic alternatives in the preliminary but encouraging results presented.
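A hedged sketch of what a gene-doubling operator might look like on a genome represented as an ordered list of productions; ONTOL's actual production syntax and the surrounding GA machinery are not reproduced.

```python
import random

# Doubling copies a randomly chosen gene in place, giving selection a
# redundant copy that is free to diverge under later mutation.

def gene_doubling(genome, rng=random):
    i = rng.randrange(len(genome))
    return genome[:i + 1] + [genome[i]] + genome[i + 1:]

genome = ["P0", "P1", "P2", "P3"]       # stand-in production labels
print(gene_doubling(genome))            # e.g. ['P0', 'P1', 'P1', 'P2', 'P3']
```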
Efficient Pattern Recognition Using a New Transformation Distance
Simard, Patrice, LeCun, Yann, Denker, John S.
Memory-based classification algorithms such as radial basis functions or K-nearest neighbors typically rely on simple distances (Euclidean, dot product, ...), which are not particularly meaningful on pattern vectors. More complex, better-suited distance measures are often expensive and rather ad hoc (elastic matching, deformable templates). We propose a new distance measure which (a) can be made locally invariant to any set of transformations of the input and (b) can be computed efficiently. We tested the method on large handwritten character databases provided by the Post Office and the NIST. Using invariances with respect to translation, rotation, scaling, shearing and line thickness, the method consistently outperformed all other systems tested on the same databases.
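A sketch of the one-sided form of such a transformation distance: the query pattern slides along a tangent plane spanned by a few transformation directions, and the distance is minimized over the tangent coefficients. The tangent matrix here is random stand-in data; for images it would come from finite-difference approximations of the transformations.

```python
import numpy as np

# One-sided tangent distance:  d(x, y) = min_a || x + T a - y ||
# where the columns of T are tangent vectors (one per transformation).
# The minimizing coefficients a are a linear least-squares solution.

def tangent_distance(x, y, T):
    a, *_ = np.linalg.lstsq(T, y - x, rcond=None)   # best tangent coefficients
    return np.linalg.norm(x + T @ a - y)

rng = np.random.default_rng(3)
x = rng.normal(size=16)
T = rng.normal(size=(16, 2))                        # two tangent directions
y = x + T @ np.array([0.3, -0.2])                   # y is a transformed x
print(f"tangent distance:   {tangent_distance(x, y, T):.2e}")   # ~0
print(f"Euclidean distance: {np.linalg.norm(x - y):.2f}")
```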
Automatic Learning Rate Maximization by On-Line Estimation of the Hessian's Eigenvectors
LeCun, Yann, Simard, Patrice Y., Pearlmutter, Barak
We propose a very simple and well-principled way of computing the optimal step size in gradient descent algorithms. The online version is very efficient computationally, and is applicable to large backpropagation networks trained on large data sets. The main ingredient is a technique for estimating the principal eigenvalue(s) and eigenvector(s) of the objective function's second derivative matrix (Hessian), which does not even require calculating the Hessian. Several other applications of this technique are proposed for speeding up learning or for eliminating useless parameters.

1 INTRODUCTION

Choosing the appropriate learning rate, or step size, in a gradient descent procedure such as backpropagation is simultaneously one of the most crucial and most expert-intensive parts of neural-network learning. We propose a method for computing the best step size which is well principled, simple, very cheap computationally, and, most of all, applicable to online training with large networks and data sets.
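A sketch of the main ingredient as described: power iteration driven by finite-difference Hessian-vector products, which needs only gradient evaluations. A quadratic toy objective stands in for a network's error surface, and the 1/λ_max step-size prescription is one natural reading of "optimal step size", assumed here for illustration.

```python
import numpy as np

# Hessian-vector products via finite differences of the gradient:
#     H v  ≈  (grad(w + eps * v) - grad(w)) / eps      for unit-norm v,
# iterated (power method) to estimate the principal eigenvalue lambda_max.

rng = np.random.default_rng(4)
A = rng.normal(size=(10, 10))
H_true = A @ A.T                         # Hessian of E(w) = 0.5 * w^T H w
grad = lambda w: H_true @ w              # stand-in for backprop gradients

w = rng.normal(size=10)
v = rng.normal(size=10)
eps = 1e-5
for _ in range(100):
    v /= np.linalg.norm(v)
    Hv = (grad(w + eps * v) - grad(w)) / eps   # Hessian-vector product
    lam = v @ Hv                               # Rayleigh quotient estimate
    v = Hv

print(f"estimated lambda_max: {lam:.3f}  (true: {np.linalg.eigvalsh(H_true)[-1]:.3f})")
print(f"suggested step size:  {1.0 / lam:.4f}")
```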
Input Reconstruction Reliability Estimation
This paper describes a technique called Input Reconstruction Reliability Estimation (IRRE) for determining the response reliability of a restricted class of multi-layer perceptrons (MLPs). The technique uses a network's ability to accurately encode the input pattern in its internal representation as a measure of its reliability. The more accurately a network is able to reconstruct the input pattern from its internal representation, the more reliable the network is considered to be. IRRE provides a good estimate of the reliability of MLPs trained for autonomous driving. Results are presented in which the reliability estimates provided by IRRE are used to select between networks trained for different driving situations.

1 Introduction

In many real-world domains it is important to know the reliability of a network's response, since a single network cannot be expected to accurately handle all the possible inputs.
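A sketch of the reconstruction-error idea with the trained MLP encoder replaced by its optimal linear counterpart (a PCA bottleneck); all sizes and data are stand-ins, not the driving networks of the paper.

```python
import numpy as np

# Reconstruction error of the input from a compressed internal representation
# serves as an (inverse) reliability score: inputs far from the training
# distribution reconstruct poorly and are flagged as unreliable.

rng = np.random.default_rng(5)
X = rng.normal(size=(200, 8)) @ rng.normal(size=(8, 8))  # training "situation"
mu = X.mean(axis=0)

_, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
P = Vt[:3]                                  # 3-dim internal representation

def reconstruction_error(x):
    z = (x - mu) @ P.T                      # encode into the representation
    return np.linalg.norm((x - mu) - z @ P) # decode and compare to the input

x_in = X[0]                                 # input like the training data
x_out = 3.0 * rng.normal(size=8)            # unfamiliar input
print(f"familiar input error:   {reconstruction_error(x_in):.2f}")
print(f"unfamiliar input error: {reconstruction_error(x_out):.2f}")
```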