Europe
Reinforcement Learning Methods for Continuous-Time Markov Decision Problems
Bradtke, Steven J., Duff, Michael O.
Semi-Markov Decision Problems are continuous time generalizations ofdiscrete time Markov Decision Problems. A number of reinforcement learning algorithms have been developed recently for the solution of Markov Decision Problems, based on the ideas of asynchronous dynamic programming and stochastic approximation. Amongthese are TD(,x), Q-Iearning, and Real-time Dynamic Programming. After reviewing semi-Markov Decision Problems and Bellman's optimality equation in that context, we propose algorithms similarto those named above, adapted to the solution of semi-Markov Decision Problems. We demonstrate these algorithms by applying them to the problem of determining the optimal control fora simple queueing system. We conclude with a discussion of circumstances under which these algorithms may be usefully applied. 1 Introduction A number of reinforcement learning algorithms based on the ideas of asynchronous dynamic programming and stochastic approximation have been developed recently for the solution of Markov Decision Problems.
Dynamic Cell Structures
Dynamic Cell Structures (DCS) represent a family of artificial neural architectures suited both for unsupervised and supervised learning. They belong to the recently [Martinetz94] introduced class of Topology Representing Networks (TRN) which build perlectly topology preserving featuremaps. DCS empI'oy a modified Kohonen learning rule in conjunction with competitive Hebbian learning. The Kohonen type learning rule serves to adjust the synaptic weight vectors while Hebbian learning establishes a dynamic lateral connection structure between the units reflecting the topology of the feature manifold.
Capacity and Information Efficiency of a Brain-like Associative Net
Graham, Bruce, Willshaw, David
Bruce Graham and David Willshaw Centre for Cognitive Science, University of Edinburgh 2 Buccleuch Place, Edinburgh, EH8 9LW, UK Email: bruce@cns.ed.ac.uk&david@cns.ed.ac.uk Abstract We have determined the capacity and information efficiency of an associative net configured in a brain-like way with partial connectivity andnoisy input cues. Recall theory was used to calculate the capacity when pattern recall is achieved using a winners-takeall strategy.Transforming the dendritic sum according to input activity and unit usage can greatly increase the capacity of the associative net under these conditions. This corresponds to the level of connectivity commonly seen in the brain and invites speculation that the brain is connected in the most information efficient way. 1 INTRODUCTION Standard network associative memories become more plausible as models of associative memoryin the brain if they incorporate (1) partial connectivity, (2) sparse activity and (3) recall from noisy cues. In this paper we consider the capacity of a binary associative net (Willshaw, Buneman, & Longuet-Higgins, 1969; Willshaw, 1971; Buckingham, 1991) containing these features. While the associative net is a very simple model of associative memory, its behaviour as a storage device is not trivial and yet it is tractable to theoretical analysis. We are able to calculate 514 BruceGraham, David Willshaw the capacity of the net in different configurations and with different pattern recall strategies.
Pairwise Neural Network Classifiers with Probabilistic Outputs
Price, David, Knerr, Stefan, Personnaz, Lรฉon, Dreyfus, Gรฉrard
Multi-class classification problems can be efficiently solved by partitioning the original problem into sub-problems involving only two classes: for each pair of classes, a (potentially small) neural network is trained using only the data of these two classes. We show how to combine the outputs of the two-class neural networks in order to obtain posterior probabilities for the class decisions. The resulting probabilistic pairwise classifier is part of a handwriting recognition system which is currently applied to check reading. We present results on real world data bases and show that, from a practical point of view, these results compare favorably to other neural network approaches.
Real-Time Control of a Tokamak Plasma Using Neural Networks
Bishop, Chris M., Haynes, Paul S., Smith, Mike E U, Todd, Tom N., Trotman, David L., Windsor, Colin G.
This paper presents results from the first use of neural networks for the real-time feedback control of high temperature plasmas in a tokamak fusion experiment. The tokamak is currently the principal experimentaldevice for research into the magnetic confinement approachto controlled fusion. In the tokamak, hydrogen plasmas, at temperatures of up to 100 Million K, are confined by strong magnetic fields. Accurate control of the position and shape of the plasma boundary requires real-time feedback control of the magnetic field structure on a timescale of a few tens of microseconds. Softwaresimulations have demonstrated that a neural network approach can give significantly better performance than the linear technique currently used on most tokamak experiments. The practical application of the neural network approach requires high-speed hardware, for which a fully parallel implementation of the multilayer perceptron, using a hybrid of digital and analogue technology, has been developed.
PCA-Pyramids for Image Compression
First, we show that we can use neural networks in a pyramidal framework,yielding the so-called PCA pyramids. Then we present an image compression method based on the PCA pyramid, which is similar to the Laplace pyramid and wavelet transform. Some experimental results with real images are reported. Finally, we present a method to combine the quantization step with the learning of the PCA pyramid. 1 Introduction In the past few years, a lot of work has been done on using neural networks for image compression, d .
Connectionist Speaker Normalization with Generalized Resource Allocating Networks
Furlanello, Cesare, Giuliani, Diego, Trentin, Edmondo
The paper presents a rapid speaker-normalization technique based on neural network spectral mapping. The neural network is used as a front-end of a continuous speech recognition system (speakerdependent, HMM-based)to normalize the input acoustic data from a new speaker. The spectral difference between speakers can be reduced using a limited amount of new acoustic data (40 phonetically richsentences). Recognition error of phone units from the acoustic-phonetic continuous speech corpus APASCI is decreased with an adaptability ratio of 25%. We used local basis networks of elliptical Gaussian kernels, with recursive allocation of units and online optimization of parameters (GRAN model). For this application, themodel included a linear term. The results compare favorably with multivariate linear mapping based on constrained orthonormal transformations.
Non-linear Prediction of Acoustic Vectors Using Hierarchical Mixtures of Experts
Waterhouse, Steve R., Robinson, Anthony J.
We are concerned in this paper with the application of multiple models, specifically theHierarchical Mixtures of Experts, to time series prediction, specifically the problem of predicting acoustic vectors for use in speech coding. There have been a number of applications of multiple models in time series prediction. A classic example is the Threshold Autoregressive model (TAR) which was used by Tong & 836 S.R. Waterhouse, A. J. Robinson Lim (1980) to predict sunspot activity. More recently, Lewis, Kay and Stevens (in Weigend & Gershenfeld (1994)) describe the use of Multivariate and Regression Splines(MARS) to the prediction of future values of currency exchange rates. Finally, in speech prediction, Cuperman & Gersho (1985) describe the Switched Inter-frame Vector Prediction (SIVP) method which switches between separate linear predictorstrained on different statistical classes of speech.