Experimental Evaluation of Learning in a Neural Microsystem
Alspector, Joshua, Jayakumar, Anthony, Luna, Stephan
We report learning measurements from a system composed of a cascadable learning chip, data generators and analyzers for training pattern presentation, and an X-windows-based software interface. The 32-neuron learning chip has 496 adaptive synapses and can perform Boltzmann and mean-field learning using separate noise and gain controls.
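For reference, the Boltzmann learning rule that such chips implement in analog hardware is a purely local correlation rule. The sketch below is a minimal software analogue, assuming sampled +/-1 unit states from the clamped and free-running phases (the function name and interface are illustrative, not the chip's):

    import numpy as np

    def boltzmann_update(s_clamped, s_free, eta=0.01):
        """One Boltzmann learning step from sampled unit states.

        s_clamped, s_free: (n_samples, n_units) arrays of +/-1 states
        collected in the clamped and free-running phases. The update is
        eta * (<s_i s_j>_clamped - <s_i s_j>_free) for each synapse.
        """
        corr_clamped = s_clamped.T @ s_clamped / len(s_clamped)
        corr_free = s_free.T @ s_free / len(s_free)
        dw = eta * (corr_clamped - corr_free)
        np.fill_diagonal(dw, 0.0)  # no self-connections
        return dw

Mean-field learning replaces the sampled stochastic states with deterministic unit activations (annealing the gain rather than the noise); the correlation-difference form of the update is unchanged.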
Neural Network - Gaussian Mixture Hybrid for Speech Recognition or Density Estimation
Bengio, Yoshua, De Mori, Renato, Flammia, Giovanni, Kompe, Ralf
The subject of this paper is the integration of multi-layered Artificial Neural Networks (ANN) with probability density functions such as the Gaussian mixtures found in continuous density Hidden Markov Models (HMM). In the first part of this paper we present an ANN/HMM hybrid in which all the parameters of the system are simultaneously optimized with respect to a single criterion. In the second part of this paper, we study the relationship between the density of the inputs of the network and the density of the outputs of the network. A few experiments are presented to explore how to perform density estimation with ANNs.

1 INTRODUCTION

This paper studies the integration of Artificial Neural Networks (ANN) with probability density functions (pdf) such as the Gaussian mixtures often used in continuous density Hidden Markov Models. The ANNs considered here are multi-layered or recurrent networks with hyperbolic tangent hidden units.
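As a concrete, hedged illustration of a single global criterion of the kind described, one can maximize the log-likelihood of a Gaussian mixture evaluated on the network outputs f_theta(x_t), optimizing the ANN weights theta jointly with the mixture parameters; the paper's exact criterion may differ:

    \mathcal{L}(\theta, \{w_k, \mu_k, \Sigma_k\})
        = \sum_t \log \sum_k w_k \,
          \mathcal{N}\bigl(f_\theta(x_t);\, \mu_k, \Sigma_k\bigr)

Gradients with respect to theta flow through f_theta exactly as in ordinary backpropagation, so the network and the densities can be trained simultaneously against the one objective.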
Information Processing to Create Eye Movements
Because eye muscles never co-contract and do not deal with external loads, one can write an equation that relates motoneuron firing rate to eye position and velocity, a very uncommon situation in the CNS. The semicircular canals transduce head velocity in a linear manner by using a high background discharge rate, imparting linearity to the premotor circuits that generate eye movements. This has allowed some of the signal processing involved to be deduced, including a neural network that integrates. These ideas are often summarized by block diagrams. Unfortunately, such diagrams are of little value in describing the behavior of single neurons, a finding supported by neural network models.
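A common first-order form of the motoneuron relation alluded to above (the constants are fit per neuron, and the exact form may include higher-order terms):

    R(t) = R_0 + k \, E(t) + r \, \frac{dE(t)}{dt}

where R is the firing rate, E is eye position, R_0 is the background discharge rate, and k and r are the position and velocity sensitivities.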
Generalization Performance in PARSEC - A Structured Connectionist Parsing Architecture
This paper presents PARSEC, a system for generating connectionist parsing networks from example parses. PARSEC is not based on formal grammar systems and is geared toward spoken language tasks. PARSEC networks exhibit three strengths important for application to speech processing: 1) they learn to parse, and generalize well compared to hand-coded grammars; 2) they tolerate several types of noise; 3) they can learn to use multi-modal input. Presented are the PARSEC architecture and performance analyses along several dimensions that demonstrate PARSEC's features. PARSEC's performance is compared to that of traditional grammar-based parsing systems.

1 INTRODUCTION

While a great deal of research has been done developing parsers for natural language, adequate solutions for some of the particular problems involved in spoken language have not been found. Among the unsolved problems are the difficulty in constructing task-specific grammars, the lack of tolerance to noisy input, and the inability to effectively utilize non-symbolic information. This paper describes PARSEC, a system for generating connectionist parsing networks from example parses.
Combined Neural Network and Rule-Based Framework for Probabilistic Pattern Recognition and Discovery
Greenspan, Hayit K., Goodman, Rodney, Chellappa, Rama
A combined neural network and rule-based approach is suggested as a general framework for pattern recognition. This approach enables unsupervised and supervised learning, respectively, while providing probability estimates for the output classes. The probability maps are utilized for higher-level analysis, such as a feedback for smoothing over the output label maps and the identification of unknown patterns (pattern "discovery"). The suggested approach is presented and demonstrated on the texture-analysis task. A correct classification rate in the 90th percentile is achieved for both unstructured and structured natural texture mosaics. The advantages of the probabilistic approach to pattern analysis are demonstrated.
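The smoothing feedback described above can be pictured as local averaging of the class-probability maps before relabeling. A minimal sketch, assuming an (H, W, n_classes) probability map; the box-filter neighborhood and mixing weight alpha are our illustrative choices, not necessarily the paper's:

    import numpy as np

    def smooth_probability_maps(P, alpha=0.5):
        """Mix each pixel's class probabilities with its 4-neighborhood mean.

        P: (H, W, n_classes) array of per-pixel class probabilities.
        Returns the smoothed maps and the resulting label map.
        """
        neigh = (np.roll(P, 1, 0) + np.roll(P, -1, 0)
                 + np.roll(P, 1, 1) + np.roll(P, -1, 1)) / 4.0
        P_s = (1 - alpha) * P + alpha * neigh
        P_s /= P_s.sum(axis=-1, keepdims=True)  # renormalize per pixel
        labels = P_s.argmax(axis=-1)
        return P_s, labels

np.roll wraps around at the image borders; a real implementation would pad instead, but the feedback idea (labels decided from smoothed probabilities rather than raw ones) is the same.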
Shooting Craps in Search of an Optimal Strategy for Training Connectionist Pattern Classifiers
Hampshire, J. B. II, Vijaya Kumar, B. V. K.
We compare two strategies for training connectionist (as well as non-connectionist) models for statistical pattern recognition. The probabilistic strategy is based on the notion that Bayesian discrimination (i.e., optimal classification) is achieved when the classifier learns the a posteriori class distributions of the random feature vector. The differential strategy is based on the notion that the identity of the largest a posteriori class probability of the feature vector is all that is needed to achieve Bayesian discrimination. Each strategy is directly linked to a family of objective functions that can be used in the supervised training procedure. We prove that the probabilistic strategy, linked with error-measure objective functions such as mean-squared error and cross-entropy typically used to train classifiers, necessarily requires larger training sets and more complex classifier architectures than those needed to approximate the Bayesian discriminant function. In contrast.
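The standard result behind the probabilistic strategy, restated here for reference: with 0/1 class-membership targets, the minimizer of the mean-squared-error risk is the conditional expectation of the target, i.e., the a posteriori probability, while Bayesian discrimination only requires the arg-max to be correct:

    \arg\min_F \, \mathbb{E}\bigl[\|y - F(x)\|^2\bigr]
        \;\Rightarrow\; F_c(x) = P(\omega_c \mid x),
    \qquad
    \hat{\omega}(x) = \arg\max_c P(\omega_c \mid x)

The differential strategy exploits the gap between the two requirements: any F whose largest component picks out the largest posterior classifies optimally, without having to match the posteriors themselves.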
Tangent Prop - A formalism for specifying selected invariances in an adaptive network
Simard, Patrice, Victorri, Bernard, LeCun, Yann, Denker, John
In many machine learning applications, one has access not only to training data, but also to some high-level a priori knowledge about the desired behavior of the system. For example, it is known in advance that the output of a character recognizer should be invariant with respect to small spatial distortions of the input images (translations, rotations, scale changes, etc.). We have implemented a scheme that allows a network to learn the derivative of its outputs with respect to distortion operators of our choosing. This not only reduces the learning time and the amount of training data, but also provides a powerful language for specifying what generalizations we wish the network to perform.

1 INTRODUCTION

In machine learning, one very often knows more about the function to be learned than just the training data. An interesting case is when certain directional derivatives of the desired function are known at certain points.
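A minimal sketch of the tangent penalty, with finite differences standing in for the analytic tangent propagation of the paper (the function names are illustrative):

    import numpy as np

    def tangent_prop_penalty(f, x, tangent, eps=1e-4):
        """Directional derivative of the outputs along a distortion.

        f       : forward function, f(x) -> output vector
        x       : input pattern (e.g. a flattened image)
        tangent : d/d(alpha) s(alpha, x) at alpha = 0 for the chosen
                  distortion s (translation, rotation, scaling, ...)

        Adding this term to the training objective pushes the
        directional derivative toward zero, i.e. toward invariance.
        """
        df = (f(x + eps * tangent) - f(x - eps * tangent)) / (2 * eps)
        return np.sum(df ** 2)

When the desired directional derivatives are known but nonzero, one would penalize the squared difference between df and the known target instead of penalizing df itself.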
A Neural Net Model for Adaptive Control of Saccadic Accuracy by Primate Cerebellum and Brainstem
Dean, Paul, Mayhew, John E. W., Langdon, Pat
Accurate saccades require interaction between brainstem circuitry and the cerebellum. A model of this interaction is described, based on Kawato's principle of feedback-error-learning. In the model a part of the brainstem (the superior colliculus) acts as a simple feedback controller with no knowledge of initial eye position, and provides an error signal for the cerebellum to correct for eye-muscle nonlinearities. This teaches the cerebellum, modelled as a CMAC, to adjust appropriately the gain on the brainstem burst-generator's internal feedback loop and so alter the size of the burst sent to the motoneurons. With direction-only errors the system rapidly learns to make accurate horizontal eye movements from any starting position, and adapts realistically to subsequent simulated eye-muscle weakening or displacement of the saccadic target.
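The feedback-error-learning principle the model builds on can be stated in a few lines: the crude feedback controller's output is both added to the motor command and used as the teaching signal for the adaptive feedforward module, so learning drives the feedback contribution toward zero. A schematic sketch with a linear stand-in for the CMAC (the names and learning-rule details are ours):

    import numpy as np

    def feedback_error_learning_step(w, features, u_fb, eta=0.05):
        """One update of the adaptive feedforward module.

        w        : (n_out, n_features) adaptive weights (CMAC stand-in)
        features : current input features (e.g. target and eye position)
        u_fb     : output of the fixed feedback controller (error signal)
        """
        u_ff = w @ features
        u = u_ff + u_fb                          # command sent to the plant
        w = w + eta * np.outer(u_fb, features)   # feedback output as error
        return u, w

As the feedforward term improves, u_fb shrinks, leaving the learned module to compensate for the plant's nonlinearities on its own.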
A Topographic Product for the Optimization of Self-Organizing Feature Maps
Bauer, Hans-Ulrich, Pawelzik, Klaus, Geisel, Theo
We present a topographic product which measures the preservation of neighborhood relations as a criterion to optimize the output space topology of the map with regard to the global dimensionality D_A as well as to the dimensions in the individual directions. We test the topographic product method not only on synthetic mapping examples, but also on speech data.
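A hedged sketch of one common statement of the topographic product: for each unit j, compare its k-th nearest neighbors measured in input space V (via the codebook vectors) and in output space A (via the lattice positions). P near zero indicates a well-matched output dimensionality; details of the published definition may differ slightly:

    import numpy as np

    def topographic_product(weights, positions):
        """Topographic product P for a trained feature map.

        weights   : (N, d_in)  codebook vectors w_j in input space V
        positions : (N, d_out) unit coordinates in output space A
        """
        N = len(weights)
        dV = np.linalg.norm(weights[:, None] - weights[None], axis=-1)
        dA = np.linalg.norm(positions[:, None] - positions[None], axis=-1)
        P = 0.0
        for j in range(N):
            nA = np.argsort(dA[j])[1:]  # neighbor order in output space
            nV = np.argsort(dV[j])[1:]  # neighbor order in input space
            q1 = dV[j, nA] / dV[j, nV]
            q2 = dA[j, nA] / dA[j, nV]
            # log of the geometric mean over the first k neighbor pairs
            logP3 = np.cumsum(np.log(q1 * q2)) / (2.0 * np.arange(1, N))
            P += logP3.sum()
        return P / (N * (N - 1))

Under this convention, P < 0 suggests the output space dimensionality is too low for the data and P > 0 that it is too high.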