Learning Spatio-Temporal Planning from a Dynamic Programming Teacher: Feed-Forward Neurocontrol for Moving Obstacle Avoidance
Fahner, Gerald, Eckmiller, Rolf
Within a simple test-bed, the application of feed-forward neurocontrol to short-term planning of robot trajectories in a dynamic environment is studied. The action network is embedded in a sensorimotor system architecture that contains a separate world model. The network is continuously fed with short-term predicted spatiotemporal obstacle trajectories, and receives robot state feedback. The action net allows for external switching between alternative planning tasks. It generates goal-directed motor actions - subject to the robot's kinematic and dynamic constraints - such that collisions with moving obstacles are avoided.
Feudal Reinforcement Learning
Dayan, Peter, Hinton, Geoffrey E.
One way to speed up reinforcement learning is to enable learning to happen simultaneously at multiple resolutions in space and time. This paper shows how to create a Q-learning managerial hierarchy in which high-level managers learn how to set tasks for their sub-managers who, in turn, learn how to satisfy them. Sub-managers need not initially understand their managers' commands. They simply learn to maximise their reinforcement in the context of the current command. We illustrate the system using a simple maze task. As the system learns how to get around, satisfying commands at the multiple levels, it explores more efficiently than standard, flat Q-learning and builds a more comprehensive map.
1 INTRODUCTION
Straightforward reinforcement learning has been quite successful at some relatively complex tasks like playing backgammon (Tesauro, 1992).
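For orientation, the "standard, flat" Q-learning baseline that the abstract compares against can be sketched as below. This is only a minimal illustration of the tabular update rule, not the feudal hierarchy itself and not the authors' maze: the one-dimensional corridor, the reward of 1 at the right-hand end, and all names and parameter values (`flat_q_learning`, `alpha`, `gamma`, `epsilon`) are assumptions made for the example.

```python
import random

def flat_q_learning(n_states=6, episodes=500, alpha=0.5, gamma=0.9,
                    epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical corridor with the goal at the right end."""
    rng = random.Random(seed)
    # q[s][a]: action 0 moves left (blocked at state 0), action 1 moves right.
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy choice; ties are broken at random.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == goal else 0.0
            # One-step Q-learning target; the goal state is terminal.
            target = r if s2 == goal else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s = s2
    return q

q_table = flat_q_learning()
```

After training, the learned value of moving right exceeds that of moving left in every non-goal state, which is the greedy policy one would expect in this toy corridor.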
A Boundary Hunting Radial Basis Function Classifier which Allocates Centers Constructively
Chang, Eric I., Lippmann, Richard P.
A new boundary hunting radial basis function (BH-RBF) classifier which allocates RBF centers constructively near class boundaries is described. This classifier creates complex decision boundaries only in regions where confusions occur and corresponding RBF outputs are similar. A predicted square error measure is used to determine how many centers to add and when to stop adding centers. Two experiments are presented which demonstrate the advantages of the BH-RBF classifier. One uses artificial data with two classes and two input features, where each class contains four clusters but only one cluster is near a decision region boundary.
Attractor Neural Networks with Local Inhibition: from Statistical Physics to a Digital Programmable Integrated Circuit
Networks with local inhibition are shown to have enhanced computational performance with respect to classical Hopfield-like networks. In particular, the critical capacity of the network is increased, as is its capability to store correlated patterns. Chaotic dynamic behaviour (exponentially long transients) of the devices indicates overloading of the associative memory. An implementation based on a programmable logic device is presented here. A 16-neuron circuit is implemented with a Xilinx 4020 device.
Modeling Consistency in a Speaker Independent Continuous Speech Recognition System
Konig, Yochai, Morgan, Nelson, Wooters, Chuck, Abrash, Victor, Cohen, Michael, Franco, Horacio
We would like to incorporate speaker-dependent consistencies, such as gender, in an otherwise speaker-independent speech recognition system. In this paper we discuss a Gender Dependent Neural Network (GDNN) which can be tuned for each gender, while sharing most of the speaker-independent parameters. We use a classification network to help generate gender-dependent phonetic probabilities for a statistical (HMM) recognition system. The gender classification net predicts the gender with high accuracy, 98.3% on a Resource Management test set. However, integrating the GDNN into our hybrid HMM-neural network recognizer provided an improvement in the recognition score that is not statistically significant on a Resource Management test set.
Information, Prediction, and Query by Committee
Freund, Yoav, Seung, H. Sebastian, Shamir, Eli, Tishby, Naftali
We analyze the "query by committee" algorithm, a method for filtering informative queries from a random stream of inputs. We show that if the two-member committee algorithm achieves information gain with positive lower bound, then the prediction error decreases exponentially with the number of queries. We show that, in particular, this exponential decrease holds for query learning of thresholded smooth functions.
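The filtering idea can be illustrated with a toy sketch for the simplest case: a one-dimensional threshold function on [0, 1]. Two committee members are drawn at random from the set of thresholds still consistent with the labels seen so far, and a label is requested only when they disagree on the incoming input. The setting, the uniform sampling of committee members, and the names `query_by_committee` and `true_theta` are assumptions made for the example, not details taken from the paper.

```python
import random

def query_by_committee(true_theta, n_inputs, seed=0):
    """Two-member committee filter for a hypothetical 1-D threshold concept.

    The target labels x as positive when x >= true_theta. The version space
    is the interval [lo, hi] of thresholds consistent with all queried labels.
    """
    rng = random.Random(seed)
    lo, hi = 0.0, 1.0
    queries = 0
    for _ in range(n_inputs):
        x = rng.random()              # next input from the random stream
        t1 = rng.uniform(lo, hi)      # committee member 1
        t2 = rng.uniform(lo, hi)      # committee member 2
        if (x >= t1) != (x >= t2):    # disagreement: x is informative, so query it
            queries += 1
            if x >= true_theta:
                hi = min(hi, x)       # positive label implies theta <= x
            else:
                lo = max(lo, x)       # negative label implies theta > x
    return lo, hi, queries

lo, hi, n_queries = query_by_committee(true_theta=0.37, n_inputs=2000)
```

In this toy setting most inputs are filtered out without a label request, and every query strictly shrinks the interval of consistent thresholds.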