Country
Sequential Decision Problems and Neural Networks
Barto, A. G., Sutton, R. S., Watkins, C. J. C. H.
Decision making tasks that involve delayed consequences are very common yet difficult to address with supervised learning methods. If there is an accurate model of the underlying dynamical system, then these tasks can be formulated as sequential decision problems and solved by Dynamic Programming. This paper discusses reinforcement learning in terms of the sequential decision framework and shows how a learning algorithm similar to the one implemented by the Adaptive Critic Element used in the pole-balancer of Barto, Sutton, and Anderson (1983), and further developed by Sutton (1984), fits into this framework. Adaptive neural networks can play significant roles as modules for approximating the functions required for solving sequential decision problems.
The Perceptron Algorithm Is Fast for Non-Malicious Distributions
Interest in this algorithm waned in the 1970's after it was emphasized[Minsky and Papert, 1969] (1) that the class of problems solvable by a single half space was limited, and (2) that the Perceptron algorithm, although converging in finite time, did not converge in polynomial time. In the 1980's, however, it has become evident that there is no hope of providing a learning algorithm which can learn arbitrary functions in polynomial time and much research has thus been restricted to algorithms which learn a function drawn from a particular class of functions. Moreover, learning theory has focused on protocols like that of [Valiant, 1984] where we seek to classify, not a fixed set of examples, but examples drawn from a probability distribution. This allows a natural notion of "generalization". There are very few classes which have yet been proven learnable in polynomial time, and one of these is the class of half spaces. Thus there is considerable theoretical interest now in studying the problem of learning a single half space, and so it is natural to reexamine the Percept ron algorithm within the formalism of Valiant.
Coupled Markov Random Fields and Mean Field Theory
Geiger, Davi, Girosi, Federico
In recent years many researchers have investigated the use of Markov Random Fields (MRFs) for computer vision. They can be applied for example to reconstruct surfaces from sparse and noisy depth data coming from the output of a visual process, or to integrate early vision processes to label physical discontinuities. In this paper we show that by applying mean field theory to those MRFs models a class of neural networks is obtained. Those networks can speed up the solution for the MRFs models. The method is not restricted to computer vision. 1 Introduction
Synergy of Clustering Multiple Back Propagation Networks
Lincoln, William P., Skrzypek, Josef
The properties of a cluster of multiple back-propagation (BP) networks are examined and compared to the performance of a single BP network. The underlying idea is that a synergistic effect within the cluster improves the perfonnance and fault tolerance. Five networks were initially trained to perfonn the same input-output mapping. Following training, a cluster was created by computing an average of the outputs generated by the individual networks. The output of the cluster can be used as the desired output during training by feeding it back to the individual networks.
Subgrouping Reduces Complexity and Speeds Up Learning in Recurrent Networks
Recurrent nets are more powerful than feedforward nets because they allow simulation of dynamical systems. Everything from sine wave generators through computers to the brain are potential candidates, but to use recurrent nets to emulate dynamical systems we need learning algorithms to program them. Here I describe a new twist on an old algorithm for recurrent nets and compare it to its predecessors.
Asymptotic Convergence of Backpropagation: Numerical Experiments
Ahmad, Subutai, Tesauro, Gerald, He, Yu
We have calculated, both analytically and in simulations, the rate of convergence at long times in the backpropagation learning algorithm for networks with and without hidden units. Our basic finding for units using the standard sigmoid transfer function is lit convergence of the error for large t, with at most logarithmic corrections for networks with hidden units. Other transfer functions may lead to a 8lower polynomial rate of convergence. Our analytic calculations were presented in (Tesauro, He & Ahamd, 1989). Here we focus in more detail on our empirical measurements of the convergence rate in numerical simulations, which confirm our analytic results.
Optimal Brain Damage
LeCun, Yann, Denker, John S., Solla, Sara A.
We have used information-theoretic ideas to derive a class of practical and nearly optimal schemes for adapting the size of a neural network. By removing unimportant weights from a network, several improvements can be expected: better generalization, fewer training examples required, and improved speed of learning and/or classification. The basic idea is to use second-derivative information to make a tradeoff between network complexity and training set error. Experiments confirm the usefulness of the methods on a real-world application. 1 INTRODUCTION Most successful applications of neural network learning to real-world problems have been achieved using highly structured networks of rather large size [for example (Waibel, 1989; Le Cun et al., 1990a)]. As applications become more complex, the networks will presumably become even larger and more structured.
Performance Comparisons Between Backpropagation Networks and Classification Trees on Three Real-World Applications
Atlas, Les E., Cole, Ronald A., Connor, Jerome T., El-Sharkawi, Mohamed A., II, Robert J. Marks, Muthusamy, Yeshwant K., Barnard, Etienne
In this paper we compare regression and classification systems. A regression system can generate an output f for an input X, where both X and f are continuous and, perhaps, multidimensional. A classification system can generate an output class, C, for an input X, where X is continuous and multidimensional and C is a member of a finite alphabet. The statistical technique of Classification And Regression Trees (CART) was developed during the years 1973 (Meisel and Michalpoulos) through 1984 (Breiman el al).
A Method for the Associative Storage of Analog Vectors
Atiya, Amir F., Abu-Mostafa, Yaser S.
A method for storing analog vectors in Hopfield's continuous feedback model is proposed. By analog vectors we mean vectors whose components are real-valued. The vectors to be stored are set as equilibria of the network. The network model consists of one layer of visible neurons and one layer of hidden neurons. We propose a learning algorithm, which results in adjusting the positions of the equilibria, as well as guaranteeing their stability.