Convergence of Stochastic Iterative Dynamic Programming Algorithms
Jaakkola, Tommi, Jordan, Michael I., Singh, Satinder P.
Increasing attention has recently been paid to algorithms based on dynamic programming (DP) due to the suitability of DP for learning problems involving control. In stochastic environments where the system being controlled is only incompletely known, however, a unifying theoretical account of these methods has been missing. In this paper we relate DP-based learning algorithms to the powerful techniques of stochastic approximation via a new convergence theorem, enabling us to establish a class of convergent algorithms to which both TD(λ) and Q-learning belong.

1 INTRODUCTION

Learning to predict the future and to find an optimal way of controlling it are the basic goals of learning systems that interact with their environment. A variety of algorithms are currently being studied for the purposes of prediction and control in incompletely specified, stochastic environments. Here we consider learning algorithms defined in Markov environments. There are actions or controls (u) available to the learner that affect both the state transition probabilities and the probability distribution for the immediate, state-dependent costs Ci(u) incurred by the learner.
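As a minimal illustration of the kind of DP-based learning rule the theorem covers, a tabular Q-learning step can be sketched as follows; the cost-minimizing target, the environment interface, and the step-size value are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def q_learning_update(Q, s, u, cost, s_next, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: move Q(s, u) toward the sampled Bellman target.

    Costs Ci(u) are minimized here, so the backed-up value uses a min over actions.
    """
    target = cost + gamma * Q[s_next].min()
    Q[s, u] += alpha * (target - Q[s, u])
    return Q

# Hypothetical usage with 5 states and 2 actions.
Q = np.zeros((5, 2))
Q = q_learning_update(Q, s=0, u=1, cost=1.0, s_next=3)
```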
The Power of Amnesia
Ron, Dana, Singer, Yoram, Tishby, Naftali
We propose a learning algorithm for a variable memory length Markov process. Human communication, whether given as text, handwriting, or speech, has multiple characteristic time scales. On short scales it is characterized mostly by the dynamics that generate the process, whereas on large scales, more syntactic and semantic information is carried. For that reason the conventionally used fixed memory Markov models cannot capture effectively the complexity of such structures. On the other hand, using long memory models uniformly is not practical even for a memory length as short as four.
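A rough sketch of the variable-memory idea, assuming a plain back-off over character contexts rather than the paper's actual construction: longer contexts are consulted only when they were actually observed in training.

```python
from collections import defaultdict

def train_var_markov(text, max_order=4):
    """Count next-symbol frequencies for every context of length 0..max_order.

    A variable-memory model would keep only contexts whose longer history
    noticeably changes the predicted distribution; here we just collect counts.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(len(text)):
        for k in range(max_order + 1):
            if i - k < 0:
                break
            counts[text[i - k:i]][text[i]] += 1
    return counts

def predict(counts, history, max_order=4):
    """Back off from the longest matching context to shorter ones."""
    for k in range(min(max_order, len(history)), -1, -1):
        context = history[len(history) - k:]
        if context in counts:
            dist = counts[context]
            total = sum(dist.values())
            return {sym: c / total for sym, c in dist.items()}
    return {}
```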
Learning Complex Boolean Functions: Algorithms and Applications
Oliveira, Arlindo L., Sangiovanni-Vincentelli, Alberto
The most commonly used neural network models are not well suited to direct digital implementations because each node needs to perform a large number of operations between floating point values. Fortunately, the ability to learn from examples and to generalize is not restricted to networks of this type. Indeed, networks where each node implements a simple Boolean function (Boolean networks) can be designed in such a way as to exhibit similar properties. Two algorithms that generate Boolean networks from examples are presented. The results show that these algorithms generalize very well in a class of problems that accept compact Boolean network descriptions.
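As a hedged illustration of what a Boolean network is (not of the two learning algorithms themselves), the sketch below evaluates a feed-forward network whose nodes each compute a simple Boolean function of earlier signals; the XOR example and node primitives are made up for illustration.

```python
def eval_boolean_network(nodes, inputs):
    """Evaluate a network whose nodes each compute a simple Boolean function
    of previously computed signals (inputs come first, then node outputs)."""
    signals = list(inputs)
    for fn, fan_in in nodes:
        signals.append(fn(*(signals[i] for i in fan_in)))
    return signals[-1]

# Hypothetical three-node network computing XOR of the two inputs.
xor_net = [
    (lambda a, b: a and not b, (0, 1)),
    (lambda a, b: b and not a, (0, 1)),
    (lambda a, b: a or b, (2, 3)),
]
assert eval_boolean_network(xor_net, [True, False])
```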
Robust Parameter Estimation and Model Selection for Neural Network Regression
In this paper, it is shown that the conventional back-propagation (BP) algorithm for neural network regression is robust to leverages (data with x corrupted), but not to outliers (data with y corrupted). A robust model is obtained by modeling the error as a mixture of normal distributions. The influence function for this mixture model is calculated and the condition for the model to be robust to outliers is given. The EM algorithm [5] is used to estimate the parameters. The usefulness of model selection criteria is also discussed.
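A minimal sketch of the E-step for such a mixture-of-normals error model, with assumed component variances and mixing weight rather than the paper's estimates: points whose residuals are better explained by the broad "outlier" component receive low weight, which is what confers robustness to y-corruption.

```python
import numpy as np

def inlier_responsibilities(residuals, sigma_in=1.0, sigma_out=10.0, pi_out=0.05):
    """E-step of a two-component normal mixture on regression residuals.

    Returns the posterior probability that each point came from the narrow
    'inlier' component; these weights down-weight likely outliers.
    """
    def normal_pdf(r, sigma):
        return np.exp(-0.5 * (r / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

    p_in = (1.0 - pi_out) * normal_pdf(residuals, sigma_in)
    p_out = pi_out * normal_pdf(residuals, sigma_out)
    return p_in / (p_in + p_out)
```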
Learning in Compositional Hierarchies: Inducing the Structure of Objects from Data
I propose a learning algorithm for learning hierarchical models for object recognition. The model architecture is a compositional hierarchy that represents part-whole relationships: parts are described in the local context of substructures of the object. The focus of this report is learning hierarchical models from data, i.e. inducing the structure of model prototypes from observed exemplars of an object. At each node in the hierarchy, a probability distribution governing its parameters must be learned. The connections between nodes reflect the structure of the object. The formulation of substructures is encouraged such that their parts become conditionally independent.
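A minimal data-structure sketch, under the assumption that each part node stores a distribution over its parameters expressed in its parent's local frame; the class name and fields are illustrative, not taken from the report.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class PartNode:
    """One node of a compositional hierarchy: a part described relative to its
    parent substructure, with a distribution over its own parameters."""
    name: str
    mean: List[float]        # part parameters in the parent's local frame
    variance: List[float]    # per-parameter variance learned from exemplars
    children: List["PartNode"] = field(default_factory=list)

# Hypothetical object model: parts are conditionally independent given their parent.
wheel_front = PartNode("wheel", mean=[0.3, -0.2], variance=[0.01, 0.01])
wheel_rear = PartNode("wheel", mean=[-0.3, -0.2], variance=[0.01, 0.01])
body = PartNode("body", mean=[0.0, 0.0], variance=[0.05, 0.05],
                children=[wheel_front, wheel_rear])
```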
H∞ Optimality Criteria for LMS and Backpropagation
Hassibi, Babak, Sayed, Ali H., Kailath, Thomas
This fact provides a theoretical justification of the widely observed excellent robustness properties of the LMS and backpropagation algorithms. We further discuss some implications of these results.

1 Introduction

The LMS algorithm was originally conceived as an approximate recursive procedure that solves the following problem (Widrow and Hoff, 1960): given a sequence of n x 1 input column vectors {h_i}, and a corresponding sequence of desired scalar responses {d_i} ...
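For reference, the Widrow-Hoff LMS recursion on such a stream of inputs {h_i} and desired responses {d_i} can be sketched as below; the step size and the synthetic data are assumptions for illustration.

```python
import numpy as np

def lms_step(w, h, d, mu=0.01):
    """One LMS update: adjust the weights along the input direction,
    scaled by the prediction error d - h @ w (Widrow-Hoff rule)."""
    error = d - h @ w
    return w + mu * error * h

# Hypothetical stream of input vectors h_i and desired responses d_i.
rng = np.random.default_rng(0)
w = np.zeros(3)
for _ in range(100):
    h = rng.normal(size=3)
    d = h @ np.array([1.0, -2.0, 0.5]) + 0.01 * rng.normal()
    w = lms_step(w, h, d)
```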
Optimal Brain Surgeon: Extensions and performance comparisons
Hassibi, Babak, Stork, David G., Wolff, Gregory
We extend Optimal Brain Surgeon (OBS) - a second-order method for pruning networks - to allow for general error measures, and explore a reduced computational and storage implementation via a dominant eigenspace decomposition. Simulations on nonlinear, noisy pattern classification problems reveal that OBS does lead to improved generalization, and performs favorably in comparison with Optimal Brain Damage (OBD). We find that the required retraining steps in OBD may lead to inferior generalization, a result that can be interpreted as due to injecting noise back into the system. A common technique is to stop training of a large network at the minimum validation error. We found that the test error could be reduced even further by means of OBS (but not OBD) pruning.
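A minimal sketch of one standard OBS pruning step (the basic second-order rule, not the extensions studied in the paper): the weight with the smallest saliency w_q^2 / (2 [H^-1]_qq) is removed, and the remaining weights are adjusted using the inverse Hessian so the increase in the quadratic error approximation is minimized.

```python
import numpy as np

def obs_prune_one(w, H_inv):
    """One Optimal Brain Surgeon step on a flattened weight vector w,
    given the inverse Hessian H_inv of the error with respect to w."""
    saliency = w ** 2 / (2.0 * np.diag(H_inv))
    q = int(np.argmin(saliency))                      # weight cheapest to remove
    w_new = w - (w[q] / H_inv[q, q]) * H_inv[:, q]    # compensate the other weights
    w_new[q] = 0.0                                    # the pruned weight is exactly zeroed
    return w_new, q
```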