Decoding of Neuronal Signals in Visual Pattern Recognition
Eskandar, Emad N., Richmond, Barry J., Hertz, John A., Optican, Lance M., Kjær, Troels W.
We have investigated the properties of neurons in inferior temporal (IT) cortex in monkeys performing a pattern matching task. Simple backpropagation networks were trained to discriminate the various stimulus conditions on the basis of the measured neuronal signal. We also trained networks to predict the neuronal response waveforms from the spatial patterns of the stimuli. The results indicate that IT neurons convey temporally encoded information about both current and remembered patterns, as well as about their behavioral context.
Green's Function Method for Fast On-Line Learning Algorithm of Recurrent Neural Networks
Sun, Guo-Zheng, Chen, Hsing-Hen, Lee, Yee-Chun
The two well-known learning algorithms for recurrent neural networks are back-propagation (Rumelhart et al., Werbos) and forward propagation (Williams and Zipser). The main drawback of back-propagation is its off-line backward pass through time for error accumulation. This violates the on-line requirement of many practical applications. Although the forward propagation algorithm can be used in an on-line manner, its drawback is the heavy computational load required to update the high-dimensional sensitivity matrix (O(N^4) operations for each time step). Developing a fast forward algorithm is therefore a challenging task.
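To make the cost of the forward-propagation sensitivity update concrete, the following is a minimal sketch, not the Green's function method of the paper. It assumes a fully recurrent tanh network with N units and no external inputs; the function name `rtrl_step` and the layout of `p` are illustrative choices. It performs one update of the sensitivities p[k][i][j] = dy_k/dw_ij and counts the multiply-adds in the inner sum.

```python
import math

def rtrl_step(w, y, p):
    """One forward-propagation (RTRL-style) step for a fully recurrent tanh net.

    w[k][l]    : weight from unit l to unit k
    y[l]       : current unit activations
    p[k][i][j] : sensitivity d y_k / d w_ij
    Returns (new_y, new_p, ops), where ops counts the multiply-adds
    spent inside the sensitivity update.
    Assumption: no external inputs or biases, to keep the sketch short.
    """
    n = len(y)
    s = [sum(w[k][l] * y[l] for l in range(n)) for k in range(n)]
    new_y = [math.tanh(v) for v in s]
    ops = 0
    new_p = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for k in range(n):
        dfk = 1.0 - new_y[k] ** 2              # derivative of tanh at s_k
        for i in range(n):
            for j in range(n):
                # direct dependence of s_k on w_ij exists only when i == k
                acc = y[j] if i == k else 0.0
                for l in range(n):             # indirect dependence via all units
                    acc += w[k][l] * p[l][i][j]
                    ops += 1
                new_p[k][i][j] = dfk * acc
    return new_y, new_p, ops

n = 3
w = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.4], [0.2, 0.1, 0.0]]
y = [0.5, -0.5, 0.25]
p = [[[0.0] * n for _ in range(n)] for _ in range(n)]
new_y, new_p, ops = rtrl_step(w, y, p)
print(ops)  # 81, i.e. n ** 4 for n = 3
```

The four nested loops over k, i, j and l are where the fourth-power growth in the number of units comes from: there are N^3 sensitivities, and each needs a sum over N units at every time step.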
Learning Unambiguous Reduced Sequence Descriptions
Do you want your neural net algorithm to learn sequences? Do not limit yourself to conventional gradient descent (or approximations thereof). Instead, use your sequence learning algorithm (any will do) to implement the following method for history compression. No matter what your final goals are, train a network to predict its next input from the previous ones. Since only unpredictable inputs convey new information, ignore all predictable inputs but let all unexpected inputs (plus information about the time step at which they occurred) become inputs to a higher-level network of the same kind (working on a slower, self-adjusting time scale). Go on building a hierarchy of such networks.
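The history-compression idea above can be illustrated with a small sketch. It is not the networks of the paper: the predictor is a hypothetical stand-in (a table remembering which symbol last followed each symbol), and the name `compress_history` is an assumption made for illustration. Only inputs the predictor fails to predict are kept, together with their time steps, and these would form the input stream of the next level.

```python
def compress_history(sequence):
    """Return the (time step, symbol) pairs a simple predictor failed to predict.

    The predictor is a stand-in for any sequence learner: it predicts that
    each symbol will be followed by whatever followed it last time.
    """
    last_successor = {}   # symbol -> symbol that most recently followed it
    unexpected = []
    prev = None           # nothing has been seen before the first input
    for t, sym in enumerate(sequence):
        predicted_ok = prev in last_successor and last_successor[prev] == sym
        if not predicted_ok:
            # unpredictable input: pass it (with its time step) upward
            unexpected.append((t, sym))
        last_successor[prev] = sym   # learn from what actually happened
        prev = sym
    return unexpected

events = compress_history("abcabcabcabx")
print(events)  # [(0, 'a'), (1, 'b'), (2, 'c'), (3, 'a'), (11, 'x')]
```

Once the repeating pattern has been learned, the twelve inputs reduce to five events; a higher-level predictor would see only those, and so operates on a slower time scale.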
The Efficient Learning of Multiple Task Sequences
I present a modular network architecture and a learning algorithm based on incremental dynamic programming that allows a single learning agent to learn to solve multiple Markovian decision tasks (MDTs) with significant transfer of learning across the tasks. I consider a class of MDTs, called composite tasks, formed by temporally concatenating a number of simpler, elemental MDTs. The architecture is trained on a set of composite and elemental MDTs. The temporal structure of a composite task is assumed to be unknown and the architecture learns to produce a temporal decomposition. It is shown that under certain conditions the solution of a composite MDT can be constructed by computationally inexpensive modifications of the solutions of its constituent elemental MDTs.
1 INTRODUCTION
Most applications of domain independent learning algorithms have focussed on learning single tasks. Building more sophisticated learning agents that operate in complex environments will require handling multiple tasks/goals (Singh, 1992). Research effort on the scaling problem has concentrated on discovering faster learning algorithms, and while that will certainly help, techniques that allow transfer of learning across tasks will be indispensable for building autonomous learning agents that have to learn to solve multiple tasks. In this paper I consider a learning agent that interacts with an external, finite-state, discrete-time, stochastic dynamical environment and faces multiple sequences of Markovian decision tasks (MDTs).
A Connectionist Learning Approach to Analyzing Linguistic Stress
Gupta, Prahlad, Touretzky, David S.
We use connectionist modeling to develop an analysis of stress systems in terms of ease of learnability. In traditional linguistic analyses, learnability arguments determine default parameter settings based on the feasibility of logically deducing correct settings from an initial state. Our approach provides an empirical alternative to such arguments. Based on perceptron learning experiments using data from nineteen human languages, we develop a novel characterization of stress patterns in terms of six parameters. These provide both a partial description of the stress pattern itself and a prediction of its learnability, without invoking abstract theoretical constructs such as metrical feet. This work demonstrates that machine learning methods can provide a fresh approach to understanding linguistic phenomena.
JANUS: Speech-to-Speech Translation Using Connectionist and Non-Connectionist Techniques
Waibel, Alex, Jain, Ajay N., McNair, Arthur E., Tebelskis, Joe, Osterholtz, Louise, Saito, Hiroaki, Schmidbauer, Otto, Sloboda, Tilo, Woszczyna, Monika
JANUS translates continuously spoken English and German into German, English, and Japanese. JANUS currently achieves 87% translation fidelity from English speech and 97% from German speech. We present the JANUS system along with comparative evaluations of its interchangeable processing components, with special emphasis on the connectionist modules.
Neural Network - Gaussian Mixture Hybrid for Speech Recognition or Density Estimation
Bengio, Yoshua, Mori, Renato De, Flammia, Giovanni, Kompe, Ralf
The subject of this paper is the integration of multi-layered Artificial Neural Networks (ANN) with probability density functions such as Gaussian mixtures found in continuous density Hidden Markov Models (HMM). In the first part of this paper we present an ANN/HMM hybrid in which all the parameters of the system are simultaneously optimized with respect to a single criterion. In the second part of this paper, we study the relationship between the density of the inputs of the network and the density of the outputs of the network. A few experiments are presented to explore how to perform density estimation with ANNs.
1 INTRODUCTION
This paper studies the integration of Artificial Neural Networks (ANN) with probability density functions (pdf) such as the Gaussian mixtures often used in continuous density Hidden Markov Models. The ANNs considered here are multi-layered or recurrent networks with hyperbolic tangent hidden units.
Connectionist Optimisation of Tied Mixture Hidden Markov Models
Renals, Steve, Morgan, Nelson, Bourlard, Hervé, Franco, Horacio, Cohen, Michael
Issues relating to the estimation of hidden Markov model (HMM) local probabilities are discussed. In particular we note the isomorphism of radial basis functions (RBF) networks to tied mixture density modelling; additionally we highlight the differences between these methods arising from the different training criteria employed. We present a method in which connectionist training can be modified to resolve these differences and discuss some preliminary experiments. Finally, we discuss some outstanding problems with discriminative training.
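The structural correspondence between RBF networks and tied mixture densities can be sketched as follows. This is an illustration under simplifying assumptions (one-dimensional inputs, normalised Gaussian basis functions, no output nonlinearity), and all function and parameter names are hypothetical. A tied mixture gives each HMM state its own mixture weights over a shared pool of Gaussians; an RBF network with the same Gaussians as hidden units and those weights in its linear output layer computes the same values.

```python
import math

def gaussian(x, mean, std):
    """Normalised one-dimensional Gaussian density."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def tied_mixture_likelihoods(x, means, stds, mix_weights):
    """p(x | state j) = sum_k c[j][k] * N(x; mean_k, std_k).

    The Gaussians are shared (tied) across all states; only the
    mixture weights c[j][k] are state specific.
    """
    return [sum(c * gaussian(x, m, s) for c, m, s in zip(row, means, stds))
            for row in mix_weights]

def rbf_outputs(x, means, stds, out_weights):
    """RBF network: Gaussian hidden layer followed by a linear output layer."""
    hidden = [gaussian(x, m, s) for m, s in zip(means, stds)]
    return [sum(w * h for w, h in zip(row, hidden)) for row in out_weights]

means, stds = [-1.0, 0.0, 2.0], [0.5, 1.0, 1.5]
weights = [[0.7, 0.2, 0.1],    # state 0
           [0.1, 0.3, 0.6]]    # state 1
a = tied_mixture_likelihoods(0.4, means, stds, weights)
b = rbf_outputs(0.4, means, stds, weights)
print(all(math.isclose(u, v) for u, v in zip(a, b)))  # True
```

The two functions differ only in how their parameters would be trained: mixture weights are typically constrained to be non-negative and sum to one under a likelihood criterion, whereas RBF output weights trained discriminatively need not satisfy those constraints.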