Scaling and Generalization in Neural Networks: A Case Study
Ahmad, Subutai, Tesauro, Gerald
Scaling and generalization have emerged as key issues in current studies of supervised learning from examples in neural networks. Questions such as how many training patterns and training cycles are needed for a problem of a given size and difficulty, how to represent the input, and how to choose useful training exemplars are of considerable theoretical and practical importance. Several intuitive rules of thumb have been obtained from empirical studies, but as yet there are few rigorous results. In this paper we summarize a study of generalization in the simplest possible case: perceptron networks learning linearly separable functions. The task chosen was the majority function (i.e. return a 1 if a majority of the input units are on), a predicate with a number of useful properties. We find that many aspects of generalization in multilayer networks learning large, difficult tasks are reproduced in this simple domain, in which concrete numerical results and even some analytic understanding can be achieved.
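The setting of this abstract is simple enough to reproduce directly. Below is a minimal sketch of a perceptron learning the majority function from random examples; the network size, learning rate, and pattern counts are our illustrative choices, not figures from the paper:

```python
import random

def majority(bits):
    # Target predicate: 1 if a majority of the input units are on.
    return 1 if sum(bits) > len(bits) / 2 else 0

def predict(w, b, x):
    # Threshold unit: fires if the weighted sum exceeds zero.
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def train_perceptron(n=9, n_patterns=200, epochs=50, lr=0.1, seed=0):
    # Standard perceptron rule on random binary training patterns.
    # Majority is linearly separable, so the rule is guaranteed to converge.
    rng = random.Random(seed)
    w, b = [0.0] * n, 0.0
    patterns = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n_patterns)]
    for _ in range(epochs):
        for x in patterns:
            err = majority(x) - predict(w, b, x)
            if err:
                w = [wi + lr * err * xi for wi, xi in zip(w, x)]
                b += lr * err
    return w, b

def eval_accuracy(w, b, n=9, trials=1000, seed=1):
    # Generalization: fraction correct on fresh random patterns.
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        x = [rng.randint(0, 1) for _ in range(n)]
        correct += predict(w, b, x) == majority(x)
    return correct / trials
```

Varying `n_patterns` in a sketch like this is one way to observe the kind of training-set-size effects the study examines.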
A Programmable Analog Neural Computer and Simulator
Mueller, Paul, Spiegel, Jan Van der, Blackman, David, Chiu, Timothy, Clare, Thomas, Dao, Joseph, Donham, Christopher, Hsieh, Tzu-pu, Loinaz, Marc
This report describes the design of a programmable general purpose analog neural computer and simulator. It is intended primarily for real-world real-time computations such as analysis of visual or acoustical patterns, robotics and the development of special purpose neural nets. The machine is scalable and composed of interconnected modules containing arrays of neurons, modifiable synapses and switches. It runs entirely in analog mode but connection architecture, synaptic gains and time constants as well as neuron parameters are set digitally. Each neuron has a limited number of inputs and can be connected to any but not all other neurons.
ALVINN: An Autonomous Land Vehicle in a Neural Network
ALVINN (Autonomous Land Vehicle In a Neural Network) is a 3-layer back-propagation network designed for the task of road following. Currently ALVINN takes images from a camera and a laser range finder as input and produces as output the direction the vehicle should travel in order to follow the road. Training has been conducted using simulated road images. Successful tests on the Carnegie Mellon autonomous navigation test vehicle indicate that the network can effectively follow real roads under certain field conditions. The representation developed to perform the task differs dramatically when the network is trained under various conditions, suggesting the possibility of a novel adaptive autonomous navigation system capable of tailoring its processing to the conditions at hand.
Cricket Wind Detection
A great deal of interest has recently been focused on theories concerning parallel distributed processing in central nervous systems. In particular, many researchers have become very interested in the structure and function of "computational maps" in sensory systems. As defined in a recent review (Knudsen et al, 1987), a "map" is an array of nerve cells, within which there is a systematic variation in the "tuning" of neighboring cells for a particular parameter. For example, the projection from retina to visual cortex is a relatively simple topographic map; each cortical hypercolumn itself contains a more complex "computational" map of preferred line orientation representing the angle of tilt of a simple line stimulus. The overall goal of the research in my lab is to determine how a relatively complex mapped sensory system extracts and encodes information from external stimuli. The preparation we study is the cercal sensory system of the cricket, Acheta domesticus.
Speech Production Using A Neural Network with a Cooperative Learning Mechanism
We propose a new neural network model and its learning algorithm. The proposed neural network consists of four layer types: input, hidden, output and final output layers; the hidden and output layers are replicated multiple times. Using the proposed SICL (Spread Pattern Information and Cooperative Learning) algorithm, it is possible to learn analog data accurately and to obtain smooth outputs. Using this neural network, we have developed a speech production system consisting of a phonemic symbol production subsystem and a speech parameter production subsystem. We have succeeded in producing natural speech waves with high accuracy.
Dynamics of Analog Neural Networks with Time Delay
Marcus, Charles M., Westervelt, R. M.
A time delay in the response of the neurons in a network can induce sustained oscillation and chaos. We present a stability criterion based on local stability analysis to prevent sustained oscillation in symmetric delay networks, and show an example of chaotic dynamics in a non-symmetric delay network.
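The delayed dynamics described here can be explored numerically. A minimal Euler-integration sketch of the delay equation du_i/dt = -u_i + sum_j J_ij tanh(beta u_j(t - tau)) follows; the gain, delay, and coupling matrix are our illustrative choices, and this does not reproduce the paper's stability criterion:

```python
import math

def simulate_delay_network(J, beta=5.0, tau=2.0, dt=0.05, steps=2000, u0=None):
    # Euler integration of du_i/dt = -u_i + sum_j J[i][j] * tanh(beta * u_j(t - tau)).
    n = len(J)
    delay_steps = int(round(tau / dt))
    u = list(u0) if u0 is not None else [0.1 * (i + 1) for i in range(n)]
    history = [list(u)]  # history[k] holds the state at time k * dt
    for _ in range(steps):
        # For t < tau, fall back to the oldest state (constant initial history).
        delayed = history[max(0, len(history) - 1 - delay_steps)]
        u = [u[i] + dt * (-u[i] + sum(J[i][j] * math.tanh(beta * delayed[j])
                                      for j in range(n)))
             for i in range(n)]
        history.append(list(u))
    return history
```

Plotting the returned trajectory for a symmetric, mutually inhibitory pair (J = [[0, -1], [-1, 0]]) while varying `tau` is one way to see delay-dependent behavior of the kind the stability analysis addresses.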
Optimization by Mean Field Annealing
Bilbro, Griff, Mann, Reinhold, Miller, Thomas K., Snyder, Wesley E., Bout, David E. van den, White, Mark
Nearly optimal solutions to many combinatorial problems can be found using stochastic simulated annealing. This paper extends the concept of simulated annealing from its original formulation as a Markov process to a new formulation based on mean field theory. Mean field annealing essentially replaces the discrete degrees of freedom in simulated annealing with their average values as computed by the mean field approximation. The net result is that equilibrium at a given temperature is achieved 1-2 orders of magnitude faster than with simulated annealing. A general framework for the mean field annealing algorithm is derived, and its relationship to Hopfield networks is shown. The behavior of MFA is examined both analytically and experimentally for a generic combinatorial optimization problem: graph bipartitioning. This analysis indicates the presence of critical temperatures which could be important in improving the performance of neural networks.
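As a concrete illustration of the mean field annealing idea applied to graph bipartitioning, here is a minimal sketch: continuous spins replace binary partition labels and are iterated to their mean-field averages while the temperature is lowered. The toy graph, balance penalty, and annealing schedule are our choices, not the paper's:

```python
import math
import random

def mean_field_anneal(adj, alpha=1.0, t_start=5.0, t_end=0.01, cooling=0.9,
                      sweeps=20, seed=0):
    # Anneal continuous spins v_i in [-1, 1] for the bipartitioning energy
    #   E = -sum_{(i,j) in edges} v_i v_j + (alpha / 2) * (sum_i v_i)^2,
    # i.e. minimize the cut subject to a soft balance constraint.
    n = len(adj)
    rng = random.Random(seed)
    v = [0.1 * (rng.random() - 0.5) for _ in range(n)]
    t = t_start
    while t > t_end:
        for _ in range(sweeps):
            for i in range(n):
                # Mean field h_i = -dE/dv_i; set v_i to its thermal average.
                h = sum(v[j] for j in adj[i]) - alpha * sum(v)
                v[i] = math.tanh(h / t)
        t *= cooling
    return v

def cut_size(adj, v):
    # Count edges crossing the sign partition (each edge appears twice in adj).
    return sum(1 for i in range(len(adj)) for j in adj[i]
               if v[i] * v[j] < 0) // 2
```

At high temperature the spins hover near zero; as the temperature passes a critical value they polarize toward a low-cut balanced partition, mirroring the critical-temperature behavior the analysis describes.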
Convergence and Pattern-Stabilization in the Boltzmann Machine
The Boltzmann Machine has been introduced as a means to perform global optimization for multimodal objective functions using the principles of simulated annealing. In this paper we consider its utility as a spurious-free content-addressable memory, and provide bounds on its performance in this context. We show how to exploit the machine's ability to escape local minima, in order to use it, at a constant temperature, for unambiguous associative pattern-retrieval in noisy environments. An association rule, which creates a sphere of influence around each stored pattern, is used along with the Machine's dynamics to match the machine's noisy input with one of the pre-stored patterns. Spurious fixed points, whose regions of attraction are not recognized by the rule, are skipped, due to the Machine's finite probability to escape from any state. The results apply to the Boltzmann machine and to the asynchronous net of binary threshold elements (Hopfield model). They provide the network designer with worst-case and best-case bounds for the network's performance, and allow polynomial-time tradeoff studies of design parameters.
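The constant-temperature retrieval scheme can be sketched for the Hopfield-model case: Hebbian weights store the patterns, the network is Gibbs-sampled at a fixed temperature, and an association rule declares retrieval once the state enters a Hamming sphere around a stored pattern. The patterns, temperature, and radius below are our illustrative choices; the paper's precise rule and performance bounds are not reproduced here:

```python
import math
import random

def hebbian_weights(patterns):
    # Outer-product (Hebbian) storage; no self-connections.
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w

def hamming(a, b):
    return sum(1 for x, y in zip(a, b) if x != y)

def retrieve(w, patterns, state, temp=0.2, radius=3, max_sweeps=200, seed=0):
    # Gibbs-sample the +-1 network at constant temperature; stop as soon as
    # the state falls within Hamming distance `radius` of a stored pattern
    # (the sphere-of-influence association rule).
    rng = random.Random(seed)
    s = list(state)
    n = len(s)
    for _ in range(max_sweeps):
        for i in range(n):
            h = sum(w[i][j] * s[j] for j in range(n))
            # P(s_i = +1) under the Gibbs distribution at temperature temp.
            p_up = 1.0 / (1.0 + math.exp(-2.0 * h / temp))
            s[i] = 1 if rng.random() < p_up else -1
        for p in patterns:
            if hamming(s, p) <= radius:
                return p
    return None
```

Because the temperature stays finite, the sampler retains a nonzero probability of leaving any state, which is what lets a scheme like this pass over spurious fixed points whose spheres of influence the rule does not recognize.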