Learning with Noise and Regularizers in Multilayer Neural Networks

Neural Information Processing Systems

We study the effect of noise and regularization in an online gradient-descent learning scenario for a general two-layer student network with an arbitrary number of hidden units. Training examples are randomly drawn input vectors labeled by a two-layer teacher network with an arbitrary number of hidden units; the examples are corrupted by Gaussian noise affecting either the output or the model itself. We examine the effect of both types of noise and that of weight-decay regularization on the dynamical evolution of the order parameters and the generalization error in various phases of the learning process. 1 Introduction One of the most powerful and commonly used methods for training large layered neural networks is that of online learning, whereby the internal network parameters {J} are modified after the presentation of each training example so as to minimize the corresponding error.
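The scenario described above can be illustrated with a minimal sketch. It is not the paper's analysis, which tracks order parameters analytically. The erf activation, unit-norm teacher vectors, the eta/N update scaling, and every name and default value below are assumptions chosen for illustration. Only output noise is modeled here.

```python
import math
import random


def g(a):
    # Assumed hidden-unit activation: erf(a / sqrt(2)).
    return math.erf(a / math.sqrt(2.0))


def g_prime(a):
    # Derivative of the activation above.
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * a * a)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def network_output(weights, x):
    # Two-layer network with unit hidden-to-output weights (an assumption).
    return sum(g(dot(w, x)) for w in weights)


def online_step(student, x, label, eta, gamma):
    # One online gradient-descent update with weight decay, using one example.
    n = len(x)
    fields = [dot(w, x) for w in student]
    err = label - sum(g(a) for a in fields)
    for w, a in zip(student, fields):
        delta = err * g_prime(a)
        for k in range(n):
            w[k] += (eta / n) * (delta * x[k] - gamma * w[k])
    return 0.5 * err * err


def run(n=20, k_student=2, m_teacher=2, steps=4000,
        eta=0.5, gamma=1e-3, noise_std=0.1, seed=0):
    rng = random.Random(seed)
    teacher = []
    for _ in range(m_teacher):
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(dot(v, v))
        teacher.append([c / norm for c in v])  # unit-norm teacher vectors
    student = [[rng.gauss(0.0, 0.01) for _ in range(n)]
               for _ in range(k_student)]
    errors = []
    for _ in range(steps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Teacher label corrupted by additive Gaussian output noise.
        label = network_output(teacher, x) + rng.gauss(0.0, noise_std)
        errors.append(online_step(student, x, label, eta, gamma))
    return errors
```

Calling `run()` returns the per-example squared errors in presentation order; averaging early and late windows gives a rough view of how the error evolves under the chosen noise level and weight decay.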


Removing Noise in On-Line Search using Adaptive Batch Sizes

Neural Information Processing Systems

Stochastic (online) learning can be faster than batch learning. However, at late times, the learning rate must be annealed to remove the noise present in the stochastic weight updates. In this annealing phase, the convergence rate (in mean square) is at best proportional to 1/T, where T is the number of input presentations. An alternative is to increase the batch size to remove the noise. In this paper we explore convergence for LMS using 1) small but fixed batch sizes and 2) an adaptive batch size. We show that the best adaptive batch schedule is exponential and has a rate of convergence which is the same as for annealing, i.e., at best proportional to 1/T. 1 Introduction Stochastic (online) learning can speed learning relative to batch training, particularly when data sets are large and contain redundant information [Mol93]. However, at late times in learning, noise present in the weight updates prevents complete convergence from taking place. To reduce the noise, the learning rate is slowly decreased (annealed) at late times. The optimal annealing schedule is asymptotically proportional to 1/t, where t is the iteration [Gol87, LO93, Orr95].
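The two noise-removal strategies contrasted above can be sketched on a one-dimensional LMS problem. This is an illustrative toy under stated assumptions, not the paper's derivation: the data model (y = w_true * x + Gaussian noise), the annealing form mu0 / (1 + mu0 * t), the doubling batch schedule, and all names and defaults are hypothetical choices.

```python
import random


def sample(rng, w_true, noise_std):
    # One training pair from an assumed 1-D linear model with output noise.
    x = rng.gauss(0.0, 1.0)
    return x, w_true * x + rng.gauss(0.0, noise_std)


def lms_annealed(total, w_true=2.0, noise_std=0.5, mu0=0.1, seed=0):
    # Online LMS (batch size 1) with a learning rate that decays like 1/t.
    rng = random.Random(seed)
    w = 0.0
    for t in range(total):
        x, y = sample(rng, w_true, noise_std)
        mu = mu0 / (1.0 + mu0 * t)
        w += mu * (y - w * x) * x
    return w


def lms_growing_batch(total, w_true=2.0, noise_std=0.5, mu=0.8,
                      batch0=4, growth=2.0, seed=0):
    # LMS with a fixed learning rate and an exponentially growing batch size.
    rng = random.Random(seed)
    w, used, batch = 0.0, 0, float(batch0)
    while used < total:
        # The last batch is truncated so both methods see `total` inputs.
        size = min(int(batch), total - used)
        grad = 0.0
        for _ in range(size):
            x, y = sample(rng, w_true, noise_std)
            grad += (y - w * x) * x
        w += mu * grad / size  # averaged update over the batch
        used += size
        batch *= growth
    return w
```

Running both functions with the same `total` compares the estimates at an equal number of input presentations, which is the quantity T that the convergence rates above are stated in.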


Neural Learning in Structured Parameter Spaces - Natural Riemannian Gradient

Neural Information Processing Systems

Shun-ichi Amari RIKEN Frontier Research Program, RIKEN, Hirosawa 2-1, Wako-shi 351-01, Japan amari@zoo.riken.go.jp Abstract The parameter space of neural networks has a Riemannian metric structure. The natural Riemannian gradient should be used instead of the conventional gradient, since the former denotes the true steepest descent direction of a loss function in the Riemannian space. The behavior of the stochastic gradient learning algorithm is much more effective if the natural gradient is used. The present paper studies the information-geometrical structure of perceptrons and other networks, and proves that the online learning method based on the natural gradient is asymptotically as efficient as the optimal batch algorithm. Adaptive modification of the learning constant is proposed and analyzed in terms of the Riemannian measure and is shown to be efficient. The natural gradient is finally applied to blind separation of mixed independent signal sources. 1 Introduction Neural learning takes place in the parameter space of modifiable synaptic weights of a neural network.
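For orientation, the distinction drawn above is usually written as follows. The symbols are assumptions introduced here rather than taken from the abstract: w for the parameters, l for the loss, eta_t for the learning constant, and G(w) for the Riemannian metric matrix (the Fisher information matrix in the statistical setting).

```latex
% Conventional stochastic gradient step
w_{t+1} = w_t - \eta_t \, \nabla l(w_t)

% Natural (Riemannian) gradient step: the gradient is premultiplied by the
% inverse metric, giving the steepest-descent direction in the Riemannian space
w_{t+1} = w_t - \eta_t \, G^{-1}(w_t) \, \nabla l(w_t)
```

When G is the identity matrix, the two updates coincide.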


A Model of Recurrent Interactions in Primary Visual Cortex

Neural Information Processing Systems

A general feature of the cerebral cortex is its massive interconnectivity: it has been estimated anatomically [19] that cortical neurons receive upwards of 5,000 synapses, the majority of which originate from other nearby cortical neurons. Numerous experiments in primary visual cortex (V1) have revealed strongly nonlinear interactions between stimulus elements which activate classical and nonclassical receptive field regions.


Cholinergic Modulation Preserves Spike Timing Under Physiologically Realistic Fluctuating Input

Neural Information Processing Systems

Recently, there has been a vigorous debate concerning the nature of neural coding (Rieke et al. 1996; Stevens and Zador 1995; Shadlen and Newsome 1994). The prevailing view has been that the mean firing rate conveys all information about the sensory stimulus in a spike train and the precise timing of the individual spikes is noise. This belief is, in part, based on a lack of correlation between the precise timing of the spikes and the sensory qualities of the stimulus under study, in particular on a lack of spike timing repeatability when identical stimulation is delivered. This view has been challenged by a number of recent studies, in which highly repeatable temporal patterns of spikes can be observed both in vivo (Bair and Koch 1996; Abeles et al. 1993) and in vitro (Mainen and Sejnowski 1994). Furthermore, application of information theory to the coding problem in the frog and house fly (Bialek et al. 1991; Bialek and Rieke 1992) suggested that additional information could be extracted from spike timing. In the absence of direct evidence for a timing code in the cerebral cortex, the role of spike timing in neural coding remains controversial.


Learning Exact Patterns of Quasi-synchronization among Spiking Neurons from Data on Multi-unit Recordings

Neural Information Processing Systems

This paper develops arguments for a family of temporal log-linear models to represent spatiotemporal correlations among the spiking events in a group of neurons. The models can represent not just pairwise correlations but also correlations of higher order. Methods are discussed for inferring the existence or absence of correlations and estimating their strength. A frequentist and a Bayesian approach to correlation detection are compared.


A Neural Model of Visual Contour Integration

Neural Information Processing Systems

Sometimes local features group into regions, as in texture segmentation; at other times they group into contours which may represent object boundaries. Although much is known about the processing steps that extract local features such as oriented input edges, it is still unclear how local features are grouped into global ones more meaningful for objects.


Extraction of Temporal Features in the Electrosensory System of Weakly Electric Fish

Neural Information Processing Systems

The weakly electric fish, Eigenmannia, generates a quasi-sinusoidal, dipole-like electric field at individually fixed frequencies (250-600 Hz) by discharging an electric organ located in its tail (see Bullock and Heiligenberg, 1986 for reviews).


Neural Network Models of Chemotaxis in the Nematode Caenorhabditis Elegans

Neural Information Processing Systems

Thomas C. Ferree, Ben A. Marcotte, Shawn R. Lockery Institute of Neuroscience, University of Oregon, Eugene, Oregon 97403 Abstract We train recurrent networks to control chemotaxis in a computer model of the nematode C. elegans. The model presented is based closely on the body mechanics, behavioral analyses, neuroanatomy and neurophysiology of C. elegans, each imposing constraints relevant for information processing. Simulated worms moving autonomously in simulated chemical environments display a variety of chemotaxis strategies similar to those of biological worms. 1 INTRODUCTION The nematode C. elegans provides a unique opportunity to study the neuronal basis of neural computation in an animal capable of complex goal-oriented behaviors. The adult hermaphrodite is only 1 mm long, and has exactly 302 neurons and 95 muscle cells. The morphology of every cell and the location of most electrical and chemical synapses are known precisely (White et al., 1986), making C. elegans especially attractive for study. Whole-cell recordings are now being made on identified neurons in the nerve ring of C. elegans to determine electrophysiological properties which underlie information processing in this animal (Lockery and Goodman, unpublished).


A Hierarchical Model of Visual Rivalry

Neural Information Processing Systems

Binocular rivalry is the alternating percept that can result when the two eyes see different scenes. Recent psychophysical evidence supports an account for one component of binocular rivalry similar to that for other bistable percepts. Recent neurophysiological evidence shows that some binocular neurons are modulated with the changing percept; others are not, even if they are selective between the stimuli presented to the eyes. We extend our model to a hierarchy to address these effects. 1 Introduction Although binocular rivalry leads to distinct perceptual distress, it is revealing about the mechanisms of visual information processing. Various experiments have suggested that simple input competition cannot be the whole story. This work was supported by the NIH.