Technology
Experimental Evaluation of Learning in a Neural Microsystem
Alspector, Joshua, Jayakumar, Anthony, Luna, Stephan
Bellcore, Morristown, NJ 07962-1910. Abstract: We report learning measurements from a system composed of a cascadable learning chip, data generators and analyzers for training pattern presentation, and an X-windows-based software interface. The 32-neuron learning chip has 496 adaptive synapses and can perform Boltzmann and mean-field learning using separate noise and gain controls.
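As a rough illustration of the quantities mentioned in this abstract, the sketch below checks that 32 fully and symmetrically connected neurons give 32 × 31 / 2 = 496 synapses, and applies the textbook Boltzmann learning rule, in which each weight moves in proportion to the difference between the pairwise correlation measured with the training pattern clamped and the correlation measured with the network running freely. The function names, the ±1 state coding, and the learning rate are assumptions made for this example; the abstract does not describe the chip's actual update circuitry.

```python
def pair_count(n):
    """Number of symmetric synapses in a fully connected network of n neurons."""
    return n * (n - 1) // 2


def correlations(samples):
    """Average product s_i * s_j over sampled states, for every pair i < j."""
    n = len(samples[0])
    return {
        (i, j): sum(s[i] * s[j] for s in samples) / len(samples)
        for i in range(n)
        for j in range(i + 1, n)
    }


def boltzmann_update(weights, clamped_samples, free_samples, lr):
    """Textbook Boltzmann rule: dw_ij = lr * (<s_i s_j>_clamped - <s_i s_j>_free).

    `weights` maps a pair (i, j) to its current value; missing pairs start at 0.
    """
    c_clamped = correlations(clamped_samples)
    c_free = correlations(free_samples)
    return {
        pair: weights.get(pair, 0.0) + lr * (c_clamped[pair] - c_free[pair])
        for pair in c_clamped
    }


if __name__ == "__main__":
    print(pair_count(32))
    clamped = [[1, 1], [-1, -1]]   # units agree whenever the pattern is clamped
    free = [[1, -1], [1, 1]]       # units are uncorrelated when running freely
    print(boltzmann_update({}, clamped, free, lr=0.1))
```

In mean-field learning the sampled products would be replaced by products of averaged analog activations; the update has the same form.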
Optical Implementation of a Self-Organizing Feature Extractor
Anderson, Dana Z., Benkert, Claus, Hebler, Verena, Jang, Ju-Seog, Montgomery, Don, Saffman, Mark
We demonstrate a self-organizing system based on photorefractive ring oscillators. We employ the system in two ways that can both be thought of as feature extractors; one acts on a set of images exposed repeatedly to the system strictly as a linear feature extractor, and the other serves as a signal demultiplexer for fiber optic communications. Both systems implement unsupervised competitive learning embedded within the mode interaction dynamics between the modes of a set of ring oscillators. After a training period, the modes of the rings become associated with the different image features or carrier frequencies within the incoming data stream.
Induction of Finite-State Automata Using Second-Order Recurrent Networks
Watrous, Raymond L., Kuhn, Gary M.
By a method of heuristic search over the space of finite state automata with up to eight states, he was able to induce a recognizer for each of these languages (Tomita, 1982). Recognizers of finite-state languages have also been induced using first-order recurrent connectionist networks (Elman, 1990; Williams and Zipser, 1988; Cleeremans, Servan-Schreiber and McClelland, 1989). Generally speaking, these results were obtained by training the network to predict the next symbol (Cleeremans, Servan-Schreiber and McClelland, 1989; Williams and Zipser, 1988), rather than by training the network to accept or reject strings of different lengths. Several training algorithms used an approximation to the gradient (Elman, 1990; Cleeremans, Servan-Schreiber and McClelland, 1989) by truncating the computation of the backward recurrence. The problem of inducing languages from examples has also been approached using second-order recurrent networks (Pollack, 1990; Giles et al., 1990). Using a truncated approximation to the gradient, and Tomita's training sets, Pollack reported that "none of the ideal languages were induced" (Pollack, 1990). On the other hand, a Tomita language has been induced using the complete gradient (Giles et al., 1991). This paper reports the induction of several Tomita languages and the extraction of the corresponding automata with certain differences in method from (Giles et al., 1991).
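To make the term "second-order recurrent network" concrete, here is a minimal sketch of the forward pass only, assuming the common formulation in which each next-state unit is a sigmoid of a sum of products of a current state unit and a one-hot input symbol. The weights below are set by hand so that the network accepts strings consisting only of 1s; they are illustrative assumptions, not weights learned by the gradient methods discussed above, and the names `step` and `accepts` are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def step(W, state, symbol):
    """One second-order update: S_i' = sigmoid(sum_j sum_k W[i][j][k] * S_j * I_k).

    The input I is one-hot, so only the slice k == symbol contributes.
    """
    n = len(state)
    return [
        sigmoid(sum(W[i][j][symbol] * state[j] for j in range(n)))
        for i in range(n)
    ]


# Hand-set weights W[i][j][k] for a two-unit network over the alphabet {0, 1}.
# Unit 0 means "no 0 seen so far"; unit 1 holds itself near 1 and acts as a bias.
W = [
    [[0.0, 20.0], [-10.0, -10.0]],   # inputs to unit 0 from units 0 and 1
    [[0.0, 0.0], [20.0, 20.0]],      # inputs to unit 1 from units 0 and 1
]


def accepts(string, weights=W):
    """Run the network over a binary string and threshold unit 0 at the end."""
    state = [1.0, 1.0]
    for ch in string:
        state = step(weights, state, int(ch))
    return state[0] > 0.5


if __name__ == "__main__":
    for s in ["", "111", "1101", "0"]:
        print(repr(s), accepts(s))
```

Training to accept or reject whole strings, as described above, would compare the final value of unit 0 with a target and propagate the error back through every call to `step`.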
Neural Network - Gaussian Mixture Hybrid for Speech Recognition or Density Estimation
Bengio, Yoshua, Mori, Renato De, Flammia, Giovanni, Kompe, Ralf
The subject of this paper is the integration of multi-layered Artificial Neural Networks (ANN) with probability density functions such as Gaussian mixtures found in continuous density Hidden Markov Models (HMM). In the first part of this paper we present an ANN/HMM hybrid in which all the parameters of the system are simultaneously optimized with respect to a single criterion. In the second part of this paper, we study the relationship between the density of the inputs of the network and the density of the outputs of the networks. A few experiments are presented to explore how to perform density estimation with ANNs. 1 INTRODUCTION This paper studies the integration of Artificial Neural Networks (ANN) with probability density functions (pdf) such as the Gaussian mixtures often used in continuous density Hidden Markov Models. The ANNs considered here are multi-layered or recurrent networks with hyperbolic tangent hidden units.
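The following sketch shows one way the two ingredients named here could fit together: a hyperbolic tangent layer transforms an input vector, and a diagonal-covariance Gaussian mixture assigns a log-density to the transformed vector. It is only an illustration under stated assumptions: the function names, the diagonal covariance, and the example parameters are all hypothetical, and the joint optimization criterion used in the paper is not reproduced.

```python
import math


def tanh_layer(x, W, b):
    """One layer of hyperbolic tangent units: y_i = tanh(sum_j W[i][j] * x[j] + b[i])."""
    return [
        math.tanh(sum(w * xj for w, xj in zip(row, x)) + bi)
        for row, bi in zip(W, b)
    ]


def log_gaussian_diag(y, mean, var):
    """Log-density of a Gaussian with diagonal covariance."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (yi - m) ** 2 / v)
        for yi, m, v in zip(y, mean, var)
    )


def log_mixture_density(y, weights, means, variances):
    """log sum_k w_k N(y; mean_k, var_k), computed with the log-sum-exp trick."""
    logs = [
        math.log(w) + log_gaussian_diag(y, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)
    return top + math.log(sum(math.exp(l - top) for l in logs))


if __name__ == "__main__":
    x = [0.5, -1.0]
    hidden = tanh_layer(x, W=[[1.0, 0.0], [0.5, 0.5]], b=[0.0, 0.1])
    print(log_mixture_density(
        hidden,
        weights=[0.3, 0.7],
        means=[[0.0, 0.0], [0.5, -0.5]],
        variances=[[1.0, 1.0], [0.2, 0.2]],
    ))
```

Because the mixture log-density is differentiable with respect to the layer outputs, a single criterion can in principle drive gradient updates of both the mixture parameters and the network weights.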
Locomotion in a Lower Vertebrate: Studies of the Cellular Basis of Rhythmogenesis and Oscillator Coupling
To test whether the known connectivities of neurons in the lamprey spinal cord are sufficient to account for locomotor rhythmogenesis, a "connectionist" neural network simulation was done using identical cells connected according to experimentally established patterns. It was demonstrated that the network oscillates in a stable manner with the same phase relationships among the neurons as observed in the lamprey. The model was then used to explore coupling between identical oscillators. It was concluded that the neurons can have a dual role as rhythm generators and as coordinators between oscillators to produce the phase relations observed among segmental oscillators during swimming.
Information Processing to Create Eye Movements
Because eye muscles never co-contract and do not deal with external loads, one can write an equation that relates motoneuron firing rate to eye position and velocity - a very uncommon situation in the CNS. The semicircular canals transduce head velocity in a linear manner by using a high background discharge rate, imparting linearity to the premotor circuits that generate eye movements. This has allowed deducing some of the signal processing involved, including a neural network that integrates. These ideas are often summarized by block diagrams. Unfortunately, they are of little value in describing the behavior of single neurons - a finding supported by neural network models.
Principled Architecture Selection for Neural Networks: Application to Corporate Bond Rating Prediction
The notion of generalization ability can be defined precisely as the prediction risk, the expected performance of an estimator in predicting new observations. In this paper, we propose the prediction risk as a measure of the generalization ability of multi-layer perceptron networks and use it to select an optimal network architecture from a set of possible architectures. We also propose a heuristic search strategy to explore the space of possible architectures. The prediction risk is estimated from the available data; here we estimate the prediction risk by v-fold cross-validation and by asymptotic approximations of generalized cross-validation or Akaike's final prediction error. We apply the technique to the problem of predicting corporate bond ratings. This problem is very attractive as a case study, since it is characterized by the limited availability of the data and by the lack of a complete a priori model which could be used to impose a structure on the network architecture.
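The v-fold cross-validation estimate of prediction risk mentioned above can be sketched as follows. To keep the example self-contained, two very simple regression estimators (a constant mean and a least-squares line) stand in for the candidate network architectures; these stand-ins, the squared-error loss, the interleaved fold assignment, and all function names are assumptions made for illustration.

```python
def v_fold_prediction_risk(fit, xs, ys, v=5):
    """Estimate prediction risk as the mean squared error on held-out folds.

    `fit(train_xs, train_ys)` must return a function mapping x to a prediction.
    Fold membership is assigned by index modulo v.
    """
    n = len(xs)
    total = 0.0
    for fold in range(v):
        train = [i for i in range(n) if i % v != fold]
        held_out = [i for i in range(n) if i % v == fold]
        predict = fit([xs[i] for i in train], [ys[i] for i in train])
        total += sum((predict(xs[i]) - ys[i]) ** 2 for i in held_out)
    return total / n


def fit_mean(xs, ys):
    """Candidate 1: always predict the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Candidate 2: one-dimensional least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return lambda x: mean_y + slope * (x - mean_x)


if __name__ == "__main__":
    xs = [i / 10 for i in range(20)]
    ys = [2.0 * x + 1.0 + 0.1 * (-1) ** i for i, x in enumerate(xs)]
    risks = {
        "mean": v_fold_prediction_risk(fit_mean, xs, ys),
        "line": v_fold_prediction_risk(fit_line, xs, ys),
    }
    print(risks)
    print("selected:", min(risks, key=risks.get))
```

Selecting an architecture then amounts to choosing the candidate with the smallest estimated risk; with network candidates, `fit` would train a network of the given architecture on the training folds.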
Kernel Regression and Backpropagation Training With Noise
Koistinen, Petri, Holmström, Lasse
One method proposed for improving the generalization capability of a feedforward network trained with the backpropagation algorithm is to use artificial training vectors which are obtained by adding noise to the original training vectors. We discuss the connection of such backpropagation training with noise to kernel density and kernel regression estimation. We compare by simulated examples (1) backpropagation, (2) backpropagation with noise, and (3) kernel regression in mapping estimation and pattern classification contexts.
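The two ideas compared in this abstract can be illustrated with a short sketch: a Nadaraya-Watson kernel regression estimate, and a generator of artificial training vectors formed by adding noise to the originals. The Gaussian kernel, the choice to perturb only the inputs, and the names `kernel_regression` and `noisy_training_set` are assumptions for this example; the abstract does not specify the noise distribution or whether targets are perturbed as well.

```python
import math
import random


def gaussian_kernel(u):
    """Unnormalized Gaussian kernel; the normalizer cancels in the ratio below."""
    return math.exp(-0.5 * u * u)


def kernel_regression(x, xs, ys, h):
    """Nadaraya-Watson estimate: kernel-weighted average of the targets ys.

    h is the bandwidth; x should lie close enough to the data that at least
    one kernel weight is nonzero in floating point.
    """
    weights = [gaussian_kernel((x - xi) / h) for xi in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


def noisy_training_set(xs, ys, sigma, copies, seed=0):
    """Artificial training pairs: each input is jittered by Gaussian noise.

    Targets are left unchanged (an assumption of this sketch).
    """
    rng = random.Random(seed)
    return [
        (x + rng.gauss(0.0, sigma), y)
        for _ in range(copies)
        for x, y in zip(xs, ys)
    ]


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0]
    ys = [0.0, 1.0, 4.0]
    print(kernel_regression(1.5, xs, ys, h=0.5))
    print(len(noisy_training_set(xs, ys, sigma=0.5, copies=10)))
```

Intuitively, sampling noisy copies of the inputs amounts to drawing from a kernel density estimate of the input distribution, with the noise standard deviation playing the role of the bandwidth `h`.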