Dynamics of Generalization in Linear Perceptrons

Neural Information Processing Systems

We study the evolution of the generalization ability of a simple linear perceptron with N inputs which learns to imitate a "teacher perceptron". The system is trained on p = αN binary example inputs and the generalization ability is measured by testing for agreement with the teacher on all 2^N possible binary input patterns. The dynamics may be solved analytically and exhibits a phase transition from imperfect to perfect generalization at α = 1. Except at this point, the generalization ability approaches its asymptotic value exponentially, with critical slowing down near the transition; the relaxation time is ∝ (1 − √α)^−2.
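
As a companion to the analytic result above, the following is a minimal numerical sketch (not the paper's closed-form solution): a student linear perceptron is trained by gradient descent on p = αN random ±1 examples labelled by a random teacher, and its generalization error is estimated, for large N, as arccos(R)/π, where R is the overlap between student and teacher weight vectors. The sizes, learning rate, and step count are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
N, alpha, lr, steps = 200, 0.8, 0.01, 2000    # illustrative settings, alpha < 1

teacher = rng.standard_normal(N)              # "teacher perceptron" weights
p = int(alpha * N)
X = rng.choice([-1.0, 1.0], size=(p, N))      # p = alpha*N binary example inputs
y = X @ teacher                               # teacher outputs on the training set

w = np.zeros(N)                               # student weights
for t in range(steps):
    grad = X.T @ (X @ w - y) / p              # gradient of the quadratic training error
    w -= lr * grad
    if t % 500 == 0:
        R = w @ teacher / (np.linalg.norm(w) * np.linalg.norm(teacher) + 1e-12)
        print(t, np.arccos(np.clip(R, -1.0, 1.0)) / np.pi)   # estimated generalization error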


A four neuron circuit accounts for change sensitive inhibition in salamander retina

Neural Information Processing Systems

In salamander retina, the response of On-Off ganglion cells to a central flash is reduced by movement in the receptive field surround. Through computer simulation of a 2-D model which takes into account their anatomical and physiological properties, we show that interactions between four neuron types (two bipolar and two amacrine) may be responsible for the generation and lateral conductance of this change sensitive inhibition. The model shows that the four neuron circuit can account for previously observed movement sensitive reductions in ganglion cell sensitivity and allows visualization and prediction of the spatiotemporal pattern of activity in change sensitive retinal cells.



Natural Dolphin Echo Recognition Using an Integrator Gateway Network

Neural Information Processing Systems

We have been studying the performance of a bottlenosed dolphin on a delayed matching-to-sample task to gain insight into the processes and mechanisms that the animal uses during echolocation. The dolphin recognizes targets by emitting natural sonar signals and listening to the echoes that return. This paper describes a novel neural network architecture, called an integrator gateway network, that we have developed to account for this performance. The integrator gateway network combines information from multiple echoes to classify targets with about 90% accuracy. In contrast, a standard backpropagation network performed with only about 63% accuracy.
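
The integrator gateway architecture itself is not detailed in this summary; the sketch below, under that caveat, only illustrates the general idea of accumulating per-echo class evidence across a whole echo train before deciding. The random linear scorer is a stand-in for whatever single-echo classifier is available; all names and sizes are illustrative.

import numpy as np

rng = np.random.default_rng(1)
n_classes, echo_dim, n_echoes = 3, 16, 8          # illustrative sizes

W = rng.standard_normal((n_classes, echo_dim))    # stand-in single-echo scorer

def classify_target(echoes):
    """Integrate per-echo class evidence over the whole echo train, then decide."""
    integrated = np.zeros(n_classes)
    for echo in echoes:
        scores = W @ echo                         # evidence from one echo
        integrated += np.exp(scores) / np.exp(scores).sum()   # normalized evidence
    return int(np.argmax(integrated))             # class with most accumulated evidence

echoes = rng.standard_normal((n_echoes, echo_dim))
print(classify_target(echoes))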


Generalization Properties of Radial Basis Functions

Neural Information Processing Systems

Atkeson, Brain and Cognitive Sciences Department and the Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139

We examine the ability of radial basis functions (RBFs) to generalize. We compare the performance of several types of RBFs. We use the inverse dynamics of an idealized two-joint arm as a test case. We find that without a proper choice of a norm for the inputs, RBFs have poor generalization properties. A simple global scaling of the input variables greatly improves performance.
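
A minimal sketch of the scaling effect just described, assuming a toy two-input target whose inputs live on very different scales (the idealized two-joint arm dynamics are not reproduced): Gaussian RBF interpolation is fit with and without a simple global rescaling of the inputs, and test error is compared.

import numpy as np

rng = np.random.default_rng(2)

def target(x):
    # toy nonlinear function; input 0 spans ~[-1, 1], input 1 spans ~[-100, 100]
    return np.sin(3 * x[:, 0]) + np.cos(0.03 * x[:, 1])

def rbf_fit_predict(Xtr, ytr, Xte, width):
    def design(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)   # squared distances
        return np.exp(-d2 / (2 * width ** 2))                 # Gaussian RBF activations
    Phi = design(Xtr, Xtr)
    w = np.linalg.solve(Phi + 1e-8 * np.eye(len(Xtr)), ytr)   # fit RBF weights
    return design(Xte, Xtr) @ w

Xtr = np.column_stack([rng.uniform(-1, 1, 200), rng.uniform(-100, 100, 200)])
Xte = np.column_stack([rng.uniform(-1, 1, 500), rng.uniform(-100, 100, 500)])
ytr, yte = target(Xtr), target(Xte)

scale = Xtr.std(axis=0)                       # simple global scaling of the inputs
for name, f in [("raw", lambda X: X), ("scaled", lambda X: X / scale)]:
    pred = rbf_fit_predict(f(Xtr), ytr, f(Xte), width=1.0)
    print(name, np.mean((pred - yte) ** 2))   # test mean squared error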


The Recurrent Cascade-Correlation Architecture

Neural Information Processing Systems

Scott E. Fahlman, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213

Recurrent Cascade-Correlation (RCC) is a recurrent version of the Cascade-Correlation learning architecture of Fahlman and Lebiere [Fahlman, 1990]. RCC can learn from examples to map a sequence of inputs into a desired sequence of outputs. New hidden units with recurrent connections are added to the network as needed during training. In effect, the network builds up a finite-state machine tailored specifically for the current problem. RCC retains the advantages of Cascade-Correlation: fast learning, good generalization, automatic construction of a near-minimal multi-layered network, and incremental training. Initially the network contains only inputs, output units, and the connections between them.
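
A minimal sketch of the ingredient that distinguishes RCC from feed-forward Cascade-Correlation, assuming illustrative weight values: a hidden unit with a single self-recurrent connection, whose activation at each time step depends on its own activation at the previous step. In RCC such units would be trained as candidates and then frozen; that machinery is omitted here.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def recurrent_unit(inputs, w_in, w_self, bias):
    """Run one self-recurrent hidden unit over an input sequence."""
    v, outputs = 0.0, []
    for x in inputs:                          # one time step per input vector
        v = sigmoid(np.dot(w_in, x) + w_self * v + bias)   # uses its own previous value
        outputs.append(v)
    return outputs

seq = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
print(recurrent_unit(seq, w_in=np.array([0.8, -0.5]), w_self=1.5, bias=-0.2))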


Interaction Among Ocularity, Retinotopy and On-center/Off-center Pathways During Development

Neural Information Processing Systems

The development of projections from the retinas to the cortex is analyzed mathematically according to the previously proposed thermodynamic formulation of the self-organization of neural networks. Three types of submodality in the visual afferent pathways are assumed in two models: model (A), in which ocularity and retinotopy are considered separately, and model (B), in which on-center/off-center pathways are considered in addition to ocularity and retinotopy. Model (A) shows striped ocular dominance spatial patterns and reveals a dip in the binocular bin of the ocular dominance histograms. Model (B) displays spatially modulated irregular patterns and single-peaked histograms. Comparing the simulated results with observations shows that the ocular dominance spatial patterns and histograms for models (A) and (B) agree closely with those seen in monkeys and cats.


The Tempo 2 Algorithm: Adjusting Time-Delays By Supervised Learning

Neural Information Processing Systems

In this work we describe a new method that automatically adjusts time-delays and the widths of time-windows in artificial neural networks. The inputs to the units are weighted by a Gaussian window over time, which allows the learning rules for the delays and widths to be derived in the same way as for the weights. Our results on a phoneme classification task compare well with results obtained with the TDNN by Waibel et al., which was manually optimized for the same task.
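
A minimal sketch of the windowing idea, assuming a single input channel, a single linear output, and a squared-error target (the full Tempo 2 network and the phoneme task are not reproduced): the input series is weighted by a Gaussian window whose centre (the delay d) and width (sigma) are adjusted by gradient descent alongside an ordinary connection weight w.

import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal(20)                   # one input channel over 20 time steps
t = np.arange(len(x), dtype=float)
target_y = 1.0                                # illustrative target for one output

w, d, sigma, lr = 0.5, 10.0, 3.0, 0.05        # weight, delay, window width, step size
for step in range(200):
    g = np.exp(-((t - d) ** 2) / (2 * sigma ** 2))    # Gaussian window over time
    u = np.sum(g * x)                                 # windowed input
    err = w * u - target_y
    # chain-rule gradients for the weight, the delay, and the window width
    grad_w = err * u
    grad_d = err * w * np.sum(g * (t - d) / sigma ** 2 * x)
    grad_sigma = err * w * np.sum(g * (t - d) ** 2 / sigma ** 3 * x)
    w -= lr * grad_w
    d -= lr * grad_d
    sigma = max(sigma - lr * grad_sigma, 0.1)         # keep the window width positive
print(w, d, sigma)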


Can neural networks do better than the Vapnik-Chervonenkis bounds?

Neural Information Processing Systems

These experiments are designed to test whether average generalization performance can surpass the worst-case bounds obtained from formal learning theory using the Vapnik-Chervonenkis dimension (Blumer et al., 1989). We indeed find that, in some cases, the average generalization is significantly better than the VC bound: the approach to perfect performance is exponential in the number of examples m, rather than the 1/m result of the bound. In other cases, we do find the 1/m behavior of the VC bound, and in these cases the numerical prefactor is closely related to the prefactor contained in the bound.
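
To make the two learning-curve shapes being contrasted concrete, here is a small illustrative computation, assuming placeholder constants (an arbitrary VC dimension d_vc and decay scale, not the Blumer et al. values): a worst-case style bound falls off roughly like (d/m) log m, while the exponential behaviour found in some cases approaches perfect generalization far faster.

import numpy as np

m = np.arange(10, 2000, 10)                   # number of training examples
d_vc = 20                                     # illustrative VC dimension
vc_style_curve = np.minimum(1.0, d_vc * np.log(m) / m)    # ~ (d/m) log m shape
exponential_curve = np.exp(-m / 200.0)                     # ~ exp(-m/tau) shape, tau = 200

for i in (0, 49, 198):                        # a few sample points
    print(m[i], round(float(vc_style_curve[i]), 4), round(float(exponential_curve[i]), 6))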


Relaxation Networks for Large Supervised Learning Problems

Neural Information Processing Systems

Feedback connections are required so that the teacher signal on the output neurons can modify weights during supervised learning. Relaxation methods are needed for learning static patterns with full-time feedback connections. Feedback network learning techniques have not achieved wide popularity because of the still greater computational efficiency of back-propagation. We show by simulation that relaxation networks of the kind we are implementing in VLSI are capable of learning large problems just like back-propagation networks. A microchip incorporates deterministic mean-field theory learning as well as stochastic Boltzmann learning. A multiple-chip electronic system implementing these networks will make high-speed parallel learning in them feasible in the future.
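
The chip's exact learning rule is not spelled out in this summary; the sketch below, under that caveat, shows the generic deterministic mean-field (contrastive Hebbian) scheme the text alludes to: a symmetric network is relaxed to a fixed point once with the outputs clamped to the teacher signal and once free, and the weights move by the difference of the two co-activation statistics. Network size, clamped units, and rates are illustrative.

import numpy as np

rng = np.random.default_rng(4)
n, lr, beta = 8, 0.05, 1.0                    # illustrative size, learning rate, gain
W = 0.1 * rng.standard_normal((n, n))
W = (W + W.T) / 2
np.fill_diagonal(W, 0.0)                      # symmetric weights, no self-connections

def settle(W, clamp, iters=50):
    """Relax the mean-field equations to a fixed point with some units clamped."""
    s = np.zeros(n)
    for _ in range(iters):
        s = np.tanh(beta * (W @ s))
        for i, v in clamp.items():            # hold clamped units at their values
            s[i] = v
    return s

inputs = {0: 1.0, 1: -1.0, 2: 1.0}            # "input" units, clamped in both phases
targets = {6: 1.0, 7: -1.0}                   # "output" units, clamped only when teaching
for epoch in range(20):
    s_clamped = settle(W, {**inputs, **targets})   # teacher (clamped) phase
    s_free = settle(W, inputs)                     # free phase
    dW = lr * (np.outer(s_clamped, s_clamped) - np.outer(s_free, s_free))
    W += (dW + dW.T) / 2                           # keep the weights symmetric
    np.fill_diagonal(W, 0.0)

print(np.round(settle(W, inputs)[[6, 7]], 2))      # free-phase outputs after learning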