Truncating Temporal Differences: On the Efficient Implementation of TD(lambda) for Reinforcement Learning
Temporal difference (TD) methods constitute a class of methods for learning predictions in multi-step prediction problems, parameterized by a recency factor lambda. Currently the most important application of these methods is to temporal credit assignment in reinforcement learning. Well known reinforcement learning algorithms, such as AHC or Q-learning, may be viewed as instances of TD learning. This paper examines the issues of the efficient and general implementation of TD(lambda) for arbitrary lambda, for use with reinforcement learning algorithms optimizing the discounted sum of rewards. The traditional approach, based on eligibility traces, is argued to suffer from both inefficiency and lack of generality. The TTD (Truncated Temporal Differences) procedure is proposed as an alternative, that indeed only approximates TD(lambda), but requires very little computation per action and can be used with arbitrary function representation methods. The idea from which it is derived is fairly simple and not new, but probably unexplored so far. Encouraging experimental results are presented, suggesting that using lambda > 0 with the TTD procedure allows one to obtain a significant learning speedup at essentially the same cost as usual TD(0) learning.
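As a rough illustration of the truncation idea, the sketch below computes a lambda-return over a finite window of m steps, bootstrapping from the current value estimate at the end of the window. It is a minimal sketch of one common way to truncate a TD(lambda) return, not necessarily the exact TTD procedure of the paper. The function name, argument names, and list-based interface are assumptions made for this example.

```python
def truncated_lambda_return(rewards, next_values, gamma, lam):
    """Truncated lambda-return for the first step of an m-step window.

    Hypothetical interface (not taken from the paper):
      rewards[k]     -- reward received at step t+k, for k = 0..m-1
      next_values[k] -- current value estimate of the state reached
                        after step t+k, i.e. the state at time t+k+1
      gamma          -- discount factor
      lam            -- recency factor lambda in [0, 1]
    """
    if not rewards or len(rewards) != len(next_values):
        raise ValueError("rewards and next_values must be non-empty and equal length")

    # Last step of the window: bootstrap fully from the value estimate.
    z = rewards[-1] + gamma * next_values[-1]

    # Walk backwards through the earlier steps, mixing the longer return z
    # with the one-step bootstrap value according to lambda.
    for r, v in zip(reversed(rewards[:-1]), reversed(next_values[:-1])):
        z = r + gamma * (lam * z + (1.0 - lam) * v)
    return z
```

With lam = 0 the result reduces to the usual one-step TD(0) target, and with lam = 1 it becomes the m-step discounted return with a bootstrap at the end of the window. The cost per call is linear in the window length m and independent of how the value function is represented.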
Bayesian Backpropagation Over I-O Functions Rather Than Weights
The conventional Bayesian justification of backprop is that it finds the MAP weight vector. As this paper shows, to find the MAP i-o function instead one must add a correction term to backprop. That term biases one towards i-o functions with small description lengths, and in particular favors (some kinds of) feature-selection, pruning, and weight-sharing.
Backpropagation Convergence Via Deterministic Nonmonotone Perturbed Minimization
Mangasarian, O. L., Solodov, M. V.
The fundamental backpropagation (BP) algorithm for training artificial neural networks is cast as a deterministic nonmonotone perturbed gradient method. Under certain natural assumptions, such as the series of learning rates diverging while the series of their squares converges, it is established that every accumulation point of the online BP iterates is a stationary point of the BP error function. The results presented cover serial and parallel online BP, modified BP with a momentum term, and BP with weight decay.
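The learning-rate assumption mentioned in this abstract can be written out as follows. The symbol \eta_i for the learning rate at iteration i is an assumption made here for illustration and is not taken from the abstract.

```latex
\sum_{i=0}^{\infty} \eta_i = \infty,
\qquad
\sum_{i=0}^{\infty} \eta_i^{2} < \infty
```

One familiar schedule that satisfies both conditions is \eta_i = 1/(i+1): the harmonic series diverges, while the series of its squares converges.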
Analyzing Cross-Connected Networks
Shultz, Thomas R., Elman, Jeffrey L.
The nonlinear complexities of neural networks make network solutions difficult to understand. Sanger's contribution analysis is here extended to the analysis of networks automatically generated by the cascade-correlation learning algorithm. Because such networks have cross connections that supersede hidden layers, standard analyses of hidden unit activation patterns are insufficient. A contribution is defined as the product of an output weight and the associated activation on the sending unit, whether that sending unit is an input or a hidden unit, multiplied by the sign of the output target for the current input pattern.
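The definition of a contribution given above can be sketched in a few lines. This is only an illustration for a single output unit and a single input pattern. The function name and the list-based interface are assumptions, and the target is assumed to be nonzero (for example +0.5 or -0.5) so that its sign is well defined.

```python
import math


def contributions(output_weights, sending_activations, target):
    """Contribution of each sending unit to one output unit for one pattern.

    Hypothetical interface (not taken from the paper):
      output_weights[i]      -- weight from sending unit i to the output unit
      sending_activations[i] -- activation of sending unit i (input or hidden)
      target                 -- output target for this pattern, assumed nonzero
    """
    if len(output_weights) != len(sending_activations):
        raise ValueError("one activation is required per output weight")

    # Sign of the target: +1.0 for positive targets, -1.0 for negative ones.
    sign = math.copysign(1.0, target)

    # Contribution = weight * sending activation * sign of the target.
    return [w * a * sign for w, a in zip(output_weights, sending_activations)]
```

Under this definition a positive contribution means the sending unit pushes the output towards its target for that pattern, and a negative contribution means it pushes away from it.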
Neurobiology, Psychophysics, and Computational Models of Visual Attention
Niebur, Ernst, Olshausen, Bruno A.
The purpose of this workshop was to discuss both recent experimental findings and computational models of the neurobiological implementation of selective attention. Recent experimental results were presented in two of the four presentations given (C.E. Connor, Washington University and B.C. Motter, SUNY and V.A. Medical Center, Syracuse), while the other two talks were devoted to computational models. Connor presented the results of an experiment in which the receptive field profiles of V4 neurons were mapped during different states of attention in an awake, behaving monkey. The attentional focus was manipulated in this experiment by altering the position of a behaviorally relevant ring-shaped stimulus.
Non-Intrusive Gaze Tracking Using Artificial Neural Networks
Baluja, Shumeet, Pomerleau, Dean
We have developed an artificial neural network based gaze tracking system which can be customized to individual users. Unlike other gaze trackers, which normally require the user to wear cumbersome headgear, or to use a chin rest to ensure head immobility, our system is entirely non-intrusive. Currently, the best intrusive gaze tracking systems are accurate to approximately 0.75 degrees. In our experiments, we have been able to achieve an accuracy of 1.5 degrees, while allowing head mobility. In this paper we present an empirical analysis of the performance of a large number of artificial neural network architectures for this task.