Stochastic Neurodynamics

Neural Information Processing Systems

The main point of this paper is that stochastic neural networks have a mathematical structure that corresponds quite closely with that of quantum field theory. Neural network Liouvillians and Lagrangians can be derived, just as can spin Hamiltonians and Lagrangians in QFT. It remains to show the efficacy of such a description.


Development and Spatial Structure of Cortical Feature Maps: A Model Study

Neural Information Processing Systems

Feature selective cells in the primary visual cortex of several species are organized in hierarchical topographic maps of stimulus features like "position in visual space", "orientation" and "ocular dominance". In order to understand and describe their spatial structure and their development, we investigate a self-organizing neural network model based on the feature map algorithm. The model explains map formation as a dimension-reducing mapping from a high-dimensional feature space onto a two-dimensional lattice, such that "similarity" between features (or feature combinations) is translated into "spatial proximity" between the corresponding feature selective cells. The model is able to reproduce several aspects of the spatial structure of cortical maps in the visual cortex. 1 Introduction Cortical maps are functionally defined structures of the cortex, which are characterized by an ordered spatial distribution of functionally specialized cells along the cortical surface. In the primary visual area(s) the response properties of these cells must be described by several independent features, and there is a strong tendency to map combinations of these features onto the cortical surface in a way that translates "similarity" into "spatial proximity" of the corresponding feature selective cells (see e.g.


Navigating through Temporal Difference

Neural Information Processing Systems

Barto, Sutton and Watkins [2] introduced a grid task as a didactic example of temporal difference planning and asynchronous dynamical programming. This paper considers the effects of changing the coding of the input stimulus, and demonstrates that the self-supervised learning of a particular form of hidden unit representation improves performance.


Design and Implementation of a High Speed CMAC Neural Network Using Programmable CMOS Logic Cell Arrays

Neural Information Processing Systems

A high speed implementation of the CMAC neural network was designed using dedicated CMOS logic. This technology was then used to implement two general purpose CMAC associative memory boards for the VME bus. Each board implements up to 8 independent CMAC networks with a total of one million adjustable weights. Each CMAC network can be configured to have from 1 to 512 integer inputs and from 1 to 8 integer outputs. Response times for typical CMAC networks are well below 1 millisecond, making the networks sufficiently fast for most robot control problems, and many pattern recognition and signal processing problems.


Time Trials on Second-Order and Variable-Learning-Rate Algorithms

Neural Information Processing Systems

The performance of seven minimization algorithms is compared on five neural network problems. These include a variable-step-size algorithm, conjugate gradient, and several methods with explicit analytic or numerical approximations to the Hessian.



Dynamics of Generalization in Linear Perceptrons

Neural Information Processing Systems

We study the evolution of the generalization ability of a simple linear perceptron with $N$ inputs which learns to imitate a "teacher perceptron". The system is trained on $p = \alpha N$ binary example inputs and the generalization ability measured by testing for agreement with the teacher on all $2^N$ possible binary input patterns. The dynamics may be solved analytically and exhibits a phase transition from imperfect to perfect generalization at $\alpha = 1$. Except at this point the generalization ability approaches its asymptotic value exponentially, with critical slowing down near the transition; the relaxation time is $\propto (1 - \sqrt{\alpha})^{-2}$.
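The critical slowing down mentioned in this abstract can be made concrete with a small numerical sketch. It only evaluates the quoted scaling of the relaxation time, proportional to (1 - sqrt(alpha))^-2 with alpha = p/N. The function name and the `prefactor` argument are illustrative assumptions, since the abstract does not give the constant of proportionality.

```python
import math


def relaxation_time(alpha, prefactor=1.0):
    """Relaxation time scaling, prefactor / (1 - sqrt(alpha))**2.

    alpha is the ratio p / N of training examples to inputs.
    prefactor is a hypothetical constant; the abstract states only
    the proportionality, not its value.
    """
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    gap = 1.0 - math.sqrt(alpha)
    if gap == 0.0:
        # At the transition (alpha = 1) the relaxation time diverges.
        return math.inf
    return prefactor / gap ** 2


if __name__ == "__main__":
    # The relaxation time grows without bound as alpha approaches 1
    # from either side.
    for alpha in (0.25, 0.81, 0.99, 1.0, 1.21, 4.0):
        print(f"alpha = {alpha:5.2f}  tau ~ {relaxation_time(alpha):.1f}")
```

Under this scaling the time constant is symmetric in the distance |1 - sqrt(alpha)| from the transition, so learning slows down on both sides of alpha = 1.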


Generalization by Weight-Elimination with Application to Forecasting

Neural Information Processing Systems

Inspired by the information theoretic idea of minimum description length, we add a term to the back propagation cost function that penalizes network complexity. We give the details of the procedure, called weight-elimination, describe its dynamics, and clarify the meaning of the parameters involved. From a Bayesian perspective, the complexity term can be usefully interpreted as an assumption about the prior distribution of the weights. We use this procedure to predict the sunspot time series and the notoriously noisy series of currency exchange rates. 1 INTRODUCTION Learning procedures for connectionist networks are essentially statistical devices for performing inductive inference. There is a tradeoff between two goals: on the one hand, we want such devices to be as general as possible so that they are able to learn a broad range of problems.
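The abstract does not spell out the complexity term, so the following is only a sketch under an assumption: it uses the saturating penalty commonly associated with weight-elimination, the sum over weights of (w/w0)^2 / (1 + (w/w0)^2), scaled by a factor `lam`. The names `w0`, `lam`, and all function names are illustrative and are not taken from the text above.

```python
def weight_elimination_penalty(weights, w0=1.0):
    """Assumed complexity term: sum of (w/w0)^2 / (1 + (w/w0)^2).

    Each weight contributes close to 0 when |w| << w0 and close to 1
    when |w| >> w0, so the sum roughly counts the "large" weights.
    """
    total = 0.0
    for w in weights:
        r = (w / w0) ** 2
        total += r / (1.0 + r)
    return total


def weight_elimination_gradient(weights, w0=1.0):
    """Derivative of the penalty with respect to each weight.

    d/dw [r / (1 + r)] with r = (w/w0)^2 equals
    (2 * w / w0^2) / (1 + r)^2.
    """
    grads = []
    for w in weights:
        r = (w / w0) ** 2
        grads.append((2.0 * w / w0 ** 2) / (1.0 + r) ** 2)
    return grads


def total_cost(squared_error, weights, lam=0.01, w0=1.0):
    """Error term plus lam times the complexity penalty."""
    return squared_error + lam * weight_elimination_penalty(weights, w0)


if __name__ == "__main__":
    ws = [0.05, -0.5, 3.0]
    print(weight_elimination_penalty(ws))
    print(weight_elimination_gradient(ws))
    print(total_cost(1.25, ws, lam=0.1))
```

In a gradient-descent setting, `lam` times this gradient would be added to the ordinary back propagation gradient of the error, which pushes small weights toward zero while leaving large weights comparatively unaffected.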


On Stochastic Complexity and Admissible Models for Neural Network Classifiers

Neural Information Processing Systems

For a detailed rationale the reader is referred to the work of Rissanen (1984) or Wallace and Freeman (1987) and the references therein. Note that the Minimum Description Length (MDL) technique (as Rissanen's approach has become known) is implicitly related to Maximum A Posteriori (MAP) Bayesian estimation techniques if cast in the appropriate framework.


Evolution and Learning in Neural Networks: The Number and Distribution of Learning Trials Affect the Rate of Evolution

Neural Information Processing Systems

Learning can increase the rate of evolution of a population of biological organisms (the Baldwin effect). Our simulations show that in a population of artificial neural networks solving a pattern recognition problem, no learning or too much learning leads to slow evolution of the genes whereas an intermediate amount is optimal. Moreover, for a given total number of training presentations, fastest evolution occurs if different individuals within each generation receive different numbers of presentations, rather than equal numbers. Because genetic algorithms (GAs) help avoid local minima in energy functions, our hybrid learning-GA systems can be applied successfully to complex, high-dimensional pattern recognition problems. INTRODUCTION The structure and function of a biological network derives from both its evolutionary precursors and real-time learning.