Learning from Demonstration
By now it is widely accepted that learning a task from scratch, i.e., without any prior knowledge, is a daunting undertaking. Humans, however, rarely attempt to learn from scratch. They extract initial biases, as well as strategies for approaching a learning problem, from instructions and/or demonstrations by other humans. For learning control, this paper investigates how learning from demonstration can be applied in the context of reinforcement learning. We consider priming the Q-function, the value function, the policy, and the model of the task dynamics as possible areas where demonstrations can speed up learning. In general nonlinear learning problems, only model-based reinforcement learning shows significant speedup after a demonstration, while in the special case of linear quadratic regulator (LQR) problems, all methods profit from the demonstration. In an implementation of pole balancing on a complex anthropomorphic robot arm, we demonstrate that, when facing the complexities of real signal processing, model-based reinforcement learning offers the most robustness for LQR problems. Using the suggested methods, the robot learns pole balancing in just a single trial after a 30-second demonstration by the human instructor.
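The model-based route described above (estimate the task dynamics from a demonstration, then solve the LQR problem on the estimated model) can be sketched on a toy problem. This is a minimal illustration and not the paper's implementation: the scalar unstable system, the teacher gain `k_demo`, the noise level, and the cost weights `q` and `r` are all hypothetical stand-ins for the robot pole-balancing task.

```python
import random

def simulate_demonstration(a, b, k_demo, steps=50, seed=0):
    """Roll out a hypothetical teacher: u = -k_demo * x plus small exploration noise."""
    rng = random.Random(seed)
    x, data = 1.0, []
    for _ in range(steps):
        u = -k_demo * x + rng.gauss(0.0, 0.1)
        x_next = a * x + b * u  # noise-free toy dynamics (an assumption)
        data.append((x, u, x_next))
        x = x_next
    return data

def fit_linear_model(data):
    """Least-squares fit of x_next = a * x + b * u via the 2x2 normal equations."""
    sxx = sum(x * x for x, _, _ in data)
    sxu = sum(x * u for x, u, _ in data)
    suu = sum(u * u for _, u, _ in data)
    sxy = sum(x * y for x, _, y in data)
    suy = sum(u * y for _, u, y in data)
    det = sxx * suu - sxu * sxu
    a_hat = (sxy * suu - sxu * suy) / det
    b_hat = (sxx * suy - sxu * sxy) / det
    return a_hat, b_hat

def lqr_gain(a, b, q=1.0, r=1.0, iters=500):
    """Iterate the scalar discrete-time Riccati equation and return the feedback gain."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)

demo = simulate_demonstration(a=1.1, b=0.5, k_demo=1.5)
a_hat, b_hat = fit_linear_model(demo)
k = lqr_gain(a_hat, b_hat)
# The learned controller u = -k * x is stabilizing if |a - b * k| < 1.
print(a_hat, b_hat, k, abs(a_hat - b_hat * k) < 1.0)
```

The demonstration here is used only to identify the model; the controller itself comes from the Riccati iteration, so a suboptimal teacher can still yield a good policy as long as its data excite the dynamics.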
Multi-Grid Methods for Reinforcement Learning in Controlled Diffusion Processes
Pareigis, S.
A controlled diffusion process (CDP) can always be discretized in state space and time and thus reduced to a Markov Decision Problem. Algorithms like Q-learning and RTDP as described in [1] can then be applied to produce controls or optimal value functions for a fixed discretization. Problems arise when the discretization needs to be refined, or when multi-grid information needs to be extracted to accelerate the algorithm. The relation of time to state space discretization parameters is crucial in both cases. Therefore a mathematical model of the discretized process is introduced, which reflects the properties of the converged empirical process.
Multi-effect Decompositions for Financial Data Modeling
High frequency foreign exchange data can be decomposed into three components: the inventory effect component, the surprise information (news) component and the regular information component. The presence of the inventory effect and news can make analysis of trends due to the diffusion of information (regular information component) difficult. We propose a neural-net-based independent component analysis to separate high frequency foreign exchange data into these three components. Our empirical results show that our proposed multi-effect decomposition can reveal the intrinsic price behavior.
Sequential Tracking in Pricing Financial Options using Model Based and Neural Network Approaches
This paper shows how the prices of option contracts traded in financial markets can be tracked sequentially by means of the Extended Kalman Filter (EKF) algorithm. I consider call and put option pairs with identical strike price and time of maturity as a two-output nonlinear system. The Black-Scholes approach popular in the finance literature and the Radial Basis Functions neural network are used in modelling the nonlinear system generating these observations. I show how both these systems may be identified recursively using the EKF algorithm. I present results of simulations on some FTSE 100 Index options data and discuss the implications of viewing the pricing problem in this sequential manner.
1 INTRODUCTION
Data from the financial markets has recently been of much interest to the neural computing community. The complexity of the underlying macroeconomic system and the way traders react to the flow of information lead to highly nonlinear relationships between observations.
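For the Black-Scholes variant, the two-output system mentioned in the abstract can be pictured as a function that maps market inputs to a call price and a put price sharing one strike and one maturity. The sketch below assumes European-style options with no dividends; the function names and the numeric inputs are illustrative and are not taken from the paper, and the EKF recursion itself is not shown.

```python
import math

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def black_scholes_pair(spot, strike, rate, vol, tau):
    """Return (call, put) prices for a common strike and time to maturity tau (years)."""
    sqrt_tau = math.sqrt(tau)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * tau) / (vol * sqrt_tau)
    d2 = d1 - vol * sqrt_tau
    discounted_strike = strike * math.exp(-rate * tau)
    call = spot * norm_cdf(d1) - discounted_strike * norm_cdf(d2)
    put = discounted_strike * norm_cdf(-d2) - spot * norm_cdf(-d1)
    return call, put

# Illustrative inputs: at-the-money, one year to maturity.
call, put = black_scholes_pair(spot=100.0, strike=100.0, rate=0.05, vol=0.2, tau=1.0)
# Put-call parity ties the two outputs together: call - put = spot - discounted strike.
parity_gap = (call - put) - (100.0 - 100.0 * math.exp(-0.05 * 1.0))
print(round(call, 4), round(put, 4), abs(parity_gap) < 1e-9)
```

In a sequential setting, a function of this kind would play the role of the measurement equation, with quantities such as the volatility treated as the state to be tracked.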
Predicting Lifetimes in Dynamically Allocated Memory
Cohn, David A., Singh, Satinder P.
Predictions of lifetimes of dynamically allocated objects can be used to improve time and space efficiency of dynamic memory management in computer programs. Barrett and Zorn [1993] used a simple lifetime predictor and demonstrated this improvement on a variety of computer programs. In this paper, we use decision trees to do lifetime prediction on the same programs and show significantly better prediction. Our method also has the advantage that during training we can use a large number of features and let the decision tree automatically choose the relevant subset.
Rapid Visual Processing using Spike Asynchrony
Thorpe, Simon J., Gautrais, Jacques
We have investigated the possibility that rapid processing in the visual system could be achieved by using the order of firing in different neurones as a code, rather than more conventional firing rate schemes. Using SPIKENET, a neural net simulator based on integrate-and-fire neurones and in which neurones in the input layer function as analog-to-delay converters, we have modeled the initial stages of visual processing. Initial results are extremely promising. Even with activity in retinal output cells limited to one spike per neuron per image (effectively ruling out any form of rate coding), sophisticated processing based on asynchronous activation was nonetheless possible.
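The idea of input neurones acting as analog-to-delay converters, with information carried by firing order, can be illustrated with a toy sketch. It is not SPIKENET: the linear latency rule, the `t_max` parameter, and the function names are assumptions chosen only to show that a firing-order code survives a global change in input strength.

```python
def analog_to_delay(intensities, t_max=10.0):
    """Map each intensity in [0, 1] to one spike latency; stronger inputs fire earlier."""
    return [t_max * (1.0 - value) for value in intensities]

def firing_order(latencies):
    """Return neurone indices sorted by spike time (earliest first)."""
    return sorted(range(len(latencies)), key=lambda n: latencies[n])

image = [0.2, 0.9, 0.5]
dimmed = [0.5 * value for value in image]  # same pattern at half the contrast

order_full = firing_order(analog_to_delay(image))
order_dim = firing_order(analog_to_delay(dimmed))
print(order_full, order_dim)  # [1, 2, 0] [1, 2, 0]
```

Each neurone emits exactly one spike, so no firing rate can be measured, yet the order of the spikes still identifies the input pattern at both contrast levels.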
Compositionality, MDL Priors, and Object Recognition
Bienenstock, Elie, Geman, Stuart, Potter, Daniel
Images are ambiguous at each of many levels of a contextual hierarchy. Nevertheless, the high-level interpretation of most scenes is unambiguous, as evidenced by the superior performance of humans. This observation argues for global vision models, such as deformable templates. Unfortunately, such models are computationally intractable for unconstrained problems. We propose a compositional model in which primitives are recursively composed, subject to syntactic restrictions, to form tree-structured objects and object groupings. Ambiguity is propagated up the hierarchy in the form of multiple interpretations, which are later resolved by a Bayesian, equivalently minimum-description-length, cost functional.
Viewpoint Invariant Face Recognition using Independent Component Analysis and Attractor Networks
Bartlett, Marian Stewart, Sejnowski, Terrence J.
We have explored two approaches to recognizing faces across changes in pose. First, we developed a representation of face images based on independent component analysis (ICA) and compared it to a principal component analysis (PCA) representation for face recognition. The ICA basis vectors for this data set were more spatially local than the PCA basis vectors and the ICA representation had greater invariance to changes in pose. Second, we present a model for the development of viewpoint invariant responses to faces from visual experience in a biological system. The temporal continuity of natural visual experience was incorporated into an attractor network model by Hebbian learning following a lowpass temporal filter on unit activities.
Effective Training of a Neural Network Character Classifier for Word Recognition
Yaeger, Larry S., Lyon, Richard F., Webb, Brandyn J.
We have been conducting research on bottom-up classification techniques based on trainable artificial neural networks (ANNs), in combination with comprehensive but weakly-applied language models. To focus our work on a subproblem that is tractable enough to lead to usable products in a reasonable time, we have restricted the domain to hand-printing, so that strokes are clearly delineated by pen lifts. In the process of optimizing overall performance of the recognizer, we have discovered some useful techniques for architecting and training ANNs that must participate in a larger recognition process. Some of these techniques, especially the normalization of output error, frequency balancing, and error emphasis, suggest a common theme of significant value derived by reducing the effect of a priori biases in training data to better represent low frequency, low probability samples, including second and third choice probabilities. There is ample prior work in combining low-level classifiers with various search strategies to provide integrated segmentation and recognition for writing (Tappert et al. 1990) and speech (Renals et al. 1992). And there is a rich background in the use of ANNs as classifiers, including their use as a low-level character classifier in a higher-level word recognition system (Bengio et al. 1995).
Ensemble Methods for Phoneme Classification
Waterhouse, Steve R., Cook, Gary
There is now considerable interest in using ensembles or committees of learning machines to improve the performance of the system over that of a single learning machine. In most neural network ensembles, the ensemble members are trained on either the same data (Hansen & Salamon 1990) or different subsets of the data (Perrone & Cooper 1993). The ensemble members typically have different initial conditions and/or different architectures. The subsets of the data may be chosen at random, with prior knowledge or by some principled approach e.g.