Goto

Collaborating Authors

 Genre


Adaptive Back-Propagation in On-Line Learning of Multilayer Networks

Neural Information Processing Systems

This research has been motivated by the dominance of the suboptimal symmetric phase in online learning of two-layer feedforward networks trained by gradient descent [2]. This trapping is emphasized for inappropriate small learning rates but exists in all training scenarios, effecting the learning process considerably. We Adaptive Back-Propagation in Online Learning of Multilayer Networks 329 proposed an adaptive back-propagation training algorithm [Eq.


Optimization Principles for the Neural Code

Neural Information Processing Systems

Recent experiments show that the neural codes at work in a wide range of creatures share some common features. At first sight, these observations seem unrelated. However, we show that these features arise naturally in a linear filtered threshold crossing (LFTC) model when we set the threshold to maximize the transmitted information. This maximization process requires neural adaptation to not only the DC signal level, as in conventional light and dark adaptation, but also to the statistical structure of the signal and noise distributions. We also present a new approach for calculating the mutual information between a neuron's output spike train and any aspect of its input signal which does not require reconstruction of the input signal. This formulation is valid provided the correlations in the spike train are small, and we provide a procedure for checking this assumption. This paper is based on joint work (DeWeese [1], 1995). Preliminary results from the LFTC model appeared in a previous proceedings (DeWeese [2], 1995), and the conclusions we reached at that time have been reaffirmed by further analysis of the model.


Information through a Spiking Neuron

Neural Information Processing Systems

While it is generally agreed that neurons transmit information about their synaptic inputs through spike trains, the code by which this information is transmitted is not well understood. An upper bound on the information encoded is obtained by hypothesizing that the precise timing of each spike conveys information. Here we develop a general approach to quantifying the information carried by spike trains under this hypothesis, and apply it to the leaky integrate-and-fire (IF) model of neuronal dynamics. We formulate the problem in terms of the probability distribution peT) of interspike intervals (ISIs), assuming that spikes are detected with arbitrary but finite temporal resolution. In the absence of added noise, all the variability in the ISIs could encode information, and the information rate is simply the entropy of the lSI distribution, H (T) (-p(T) log2 p(T)}, times the spike rate. H (T) thus provides an exact expression for the information rate. The methods developed here can be used to determine experimentally the information carried by spike trains, even when the lower bound of the information rate provided by the stimulus reconstruction method is not tight. In a preliminary series of experiments, we have used these methods to estimate information rates of hippocampal neurons in slice in response to somatic current injection. These pilot experiments suggest information rates as high as 6.3 bits/spike.


Modeling Interactions of the Rat's Place and Head Direction Systems

Neural Information Processing Systems

We have developed a computational theory of rodent navigation that includes analogs of the place cell system, the head direction system, and path integration. In this paper we present simulation results showing how interactions between the place and head direction systems can account for recent observations about hippocampal place cell responses to doubling and/or rotation of cue cards in a cylindrical arena (Sharp et at.,


A Model of Auditory Streaming

Neural Information Processing Systems

The formation of associations between signals, which are considered to arise from the same external source, allows the organism to recognise significant patterns and relationships within the signals from each source without being confused by accidental coincidences between unrelated signals (Bregman, 1990). The intrinsically temporal nature of sound means that in addition to being able to focus on the signal of interest, perhaps of equal significance, is the ability to predict how that signal is expected to progress; such expectations can then be used to facilitate further processing of the signal. It is important to remember that perception is a creative act (Luria, 1980). The organism creates its interpretation of the world in response to the current stimuli, within the context of its current state of alertness, attention, and previous experience. The creative aspects of perception are exemplified in the auditory system where peripheral processing decomposes acoustic stimuli.


Human Reading and the Curse of Dimensionality

Neural Information Processing Systems

Whereas optical character recognition (OCR) systems learn to classify single characters; people learn to classify long character strings in parallel, within a single fixation. This difference is surprising because high dimensionality is associated with poor classification learning. This paper suggests that the human reading system avoids these problems because the number of to-be-classified images is reduced by consistent and optimal eye fixation positions, and by character sequence regularities. An interesting difference exists between human reading and optical character recognition (OCR) systems. The input/output dimensionality of character classification in human reading is much greater than that for OCR systems (see Figure 1). OCR systems classify one character at time; while the human reading system classifies as many as 8-13 characters per eye fixation (Rayner, 1979) and within a fixation, character category and sequence information is extracted in parallel (Blanchard, McConkie, Zola, and Wolverton, 1984; Reicher, 1969).


Memory-based Stochastic Optimization

Neural Information Processing Systems

In this paper we introduce new algorithms for optimizing noisy plants in which each experiment is very expensive. The algorithms build a global nonlinear model of the expected output at the same time as using Bayesian linear regression analysis of locally weighted polynomial models. The local model answers queries about confidence, noise, gradient and Hessians, and use them to make automated decisions similar to those made by a practitioner of Response Surface Methodology. The global and local models are combined naturally as a locally weighted regression. We examine the question of whether the global model can really help optimization, and we extend it to the case of time-varying functions. We compare the new algorithms with a highly tuned higher-order stochastic optimization algorithm on randomly-generated functions and a simulated manufacturing task. We note significant improvements in total regret, time to converge, and final solution quality. 1 INTRODUCTION In a stochastic optimization problem, noisy samples are taken from a plant. A sample consists of a chosen control u (a vector ofreal numbers) and a noisy observed response y.


A Dynamical Systems Approach for a Learnable Autonomous Robot

Neural Information Processing Systems

This paper discusses how a robot can learn goal-directed navigation tasks using local sensory inputs. The emphasis is that such learning tasks could be formulated as an embedding problem of dynamical systems: desired trajectories in a task space should be embedded into an adequate sensory-based internal state space so that an unique mapping from the internal state space to the motor command could be established. The paper shows that a recurrent neural network suffices in self-organizing such an adequate internal state space from the temporal sensory input.


Stock Selection via Nonlinear Multi-Factor Models

Neural Information Processing Systems

This paper discusses the use of multilayer feed forward neural networks for predicting a stock's excess return based on its exposure to various technical and fundamental factors. To demonstrate the effectiveness of the approach a hedged portfolio which consists of equally capitalized long and short positions is constructed and its historical returns are benchmarked against T-bill returns and the S&P500 index. 1 Introduction


Prediction of Beta Sheets in Proteins

Neural Information Processing Systems

Most current methods for prediction of protein secondary structure use a small window of the protein sequence to predict the structure of the central amino acid. We describe a new method for prediction of the non-local structure called,8-sheet, which consists of two or more,8-strands that are connected by hydrogen bonds. Since,8-strands are often widely separated in the protein chain, a network with two windows is introduced. After training on a set of proteins the network predicts the sheets well, but there are many false positives. By using a global energy function the,8-sheet prediction is combined with a local prediction of the three secondary structures a-helix,,8-strand and coil.