Asia
Second Order Properties of Error Surfaces: Learning Time and Generalization
LeCun, Yann, Kanter, Ido, Solla, Sara A.
The learning time of a simple neural network model is obtained through an analytic computation of the eigenvalue spectrum for the Hessian matrix, which describes the second order properties of the cost function in the space of coupling coefficients. The form of the eigenvalue distribution suggests new techniques for accelerating the learning process, and provides a theoretical justification for the choice of centered versus biased state variables.
Dynamics of Generalization in Linear Perceptrons
We study the evolution of the generalization ability of a simple linear perceptron with N inputs which learns to imitate a "teacher perceptron". The system is trained on p aN binary example inputs and the generalization ability measured by testing for agreement with the teacher on all 2N possible binary input patterns. The dynamics may be solved analytically and exhibits a phase transition from imperfect to perfect generalization at a 1. Except at this point the generalization ability approaches its asymptotic value exponentially, with critical slowing down near the transition; the relaxation time is ex (1 - y'a)-2.
Evaluation of Adaptive Mixtures of Competing Experts
Nowlan, Steven J., Hinton, Geoffrey E.
We compare the performance of the modular architecture, composed of competing expert networks, suggested by Jacobs, Jordan, Nowlan and Hinton (1991) to the performance of a single back-propagation network on a complex, but low-dimensional, vowel recognition task. Simulations reveal that this system is capable of uncovering interesting decompositions in a complex task. The type of decomposition is strongly influenced by the nature of the input to the gating network that decides which expert to use for each case. The modular architecture also exhibits consistently better generalization on many variations of the task.
A competitive modular connectionist architecture
Jacobs, Robert A., Jordan, Michael I.
We describe a multi-network, or modular, connectionist architecture that captures that fact that many tasks have structure at a level of granularity intermediate to that assumed by local and global function approximation schemes. The main innovation of the architecture is that it combines associative and competitive learning in order to learn task decompositions. A task decomposition is discovered by forcing the networks comprising the architecture to compete to learn the training patterns. As a result of the competition, different networks learn different training patterns and, thus, learn to partition the input space. The performance of the architecture on a "what" and "where" vision task and on a multi-payload robotics task are presented.
An Attractor Neural Network Model of Recall and Recognition
Ruppin, Eytan, Yeshurun, Yehezkel
This work presents an Attractor Neural Network (ANN) model of Recall and Recognition. It is shown that an ANN model can qualitatively account for a wide range of experimental psychological data pertaining to the these two main aspects of memory access. Certain psychological phenomena are accounted for, including the effects of list-length, wordfrequency, presentation time, context shift, and aging. Thereafter, the probabilities of successful Recall and Recognition are estimated, in order to possibly enable further quantitative examination of the model. 1 Motivation The goal of this paper is to demonstrate that a Hopfield-based [Hop82] ANN model can qualitatively account for a wide range of experimental psychological data pertaining to the two main aspects of memory access, Recall and Recognition. Recall is defined as the ability to retrieve an item from a list of items (words) originally presented during a previous learning phase, given an appropriate cue (cued RecalQ, or spontaneously (free RecalQ. Recognition is defined as the ability to successfully acknowledge that a certain item has or has not appeared in the tutorial list learned before. The main prospects of ANN modeling is that some parameter values, that in former, 'classical' models of memory retrieval (see e.g.
Language Induction by Phase Transition in Dynamical Recognizers
A higher order recurrent neural network architecture learns to recognize and generate languages after being "trained" on categorized exemplars. Studying these networks from the perspective of dynamical systems yields two interesting discoveries: First, a longitudinal examination of the learning process illustrates a new form of mechanical inference: Induction by phase transition. A small weight adjustment causes a "bifurcation" in the limit behavior of the network.
Integrated Segmentation and Recognition of Hand-Printed Numerals
Keeler, James D., Rumelhart, David E., Leow, Wee Kheng
Neural network algorithms have proven useful for recognition of individual, segmented characters. However, their recognition accuracy has been limited by the accuracy of the underlying segmentation algorithm. Conventional, rule-based segmentation algorithms encounter difficulty if the characters are touching, broken, or noisy. The problem in these situations is that often one cannot properly segment a character until it is recognized yet one cannot properly recognize a character until it is segmented. We present here a neural network algorithm that simultaneously segments and recognizes in an integrated system. This algorithm has several novel features: it uses a supervised learning algorithm (backpropagation), but is able to take position-independent information as targets and self-organize the activities of the units in a competitive fashion to infer the positional information. We demonstrate this ability with overlapping hand-printed numerals.