Country
A Neural Net Model for Adaptive Control of Saccadic Accuracy by Primate Cerebellum and Brainstem
Dean, Paul, Mayhew, John E. W., Langdon, Pat
Accurate saccades require interaction between brainstem circuitry and the cerebeJJum. A model of this interaction is described, based on Kawato's principle of feedback-error-Iearning. In the model a part of the brainstem (the superior colliculus) acts as a simple feedback controJJer with no knowledge of initial eye position, and provides an error signal for the cerebeJJum to correct for eye-muscle nonIinearities. This teaches the cerebeJJum, modelled as a CMAC, to adjust appropriately the gain on the brainstem burst-generator's internal feedback loop and so alter the size of burst sent to the motoneurons. With direction-only errors the system rapidly learns to make accurate horizontal eye movements from any starting position, and adapts realistically to subsequent simulated eye-muscle weakening or displacement of the saccadic target.
Reverse TDNN: An Architecture For Trajectory Generation
Trajectory generation finds interesting applications in the field of robotics, automation, filtering, or time series prediction. Neural networks, with their ability to learn from examples, have been proposed very early on for solving nonlinear control problems adaptively. Several neural net architectures have been proposed for trajectory generation, most notably recurrent networks, either with discrete time and externalloops (Jordan, 1986), or with continuous time (Pearlmutter, 1988). Aside from being recurrent, these networks are not specifically tailored for trajectory generation. It has been shown that specific architectures, such as the Time Delay Neural Networks (Lang and Hinton, 1988), or convolutional networks in general, are better than fully connected networks at recognizing time sequences such as speech (Waibel et al., 1989), or pen trajectories (Guyon et al., 1991). We show that special architectures can also be devised for trajectory generation, with dramatic performance improvement.
Fast, Robust Adaptive Control by Learning only Forward Models
A large class of motor control tasks requires that on each cycle the controller is told its current state and must choose an action to achieve a specified, state-dependent, goal behaviour. This paper argues that the optimization of learning rate, the number of experimental control decisions before adequate performance is obtained, and robustness is of prime importance-if necessary at the expense of computation per control cycle and memory requirement. This is motivated by the observation that a robot which requires two thousand learning steps to achieve adequate performance, or a robot which occasionally gets stuck while learning, will always be undesirable, whereas moderate computational expense can be accommodated by increasingly powerful computer hardware. It is not unreasonable to assume the existence of inexpensive 100 Mflop controllers within a few years and so even processes with control cycles in the low tens of milliseconds will have millions of machine instructions in which to make their decisions. This paper outlines a learning control scheme which aims to make effective use of such computational power. 1 MEMORY BASED LEARNING Memory-based learning is an approach applicable to both classification and function learning in which all experiences presented to the learning box are explicitly remembered. The memory, Mem, is a set of input-output pairs, Mem {(Xl, YI), (X21 Y2),..., (Xb Yk)}.
Fast Learning with Predictive Forward Models
A method for transforming performance evaluation signals distal both in space and time into proximal signals usable by supervised learning algorithms, presented in [Jordan & Jacobs 90], is examined. A simple observation concerning differentiation through models trained with redundant inputs (as one of their networks is) explains a weakness in the original architecture and suggests a modification: an internal world model that encodes action-space exploration and, crucially, cancels input redundancy to the forward model is added. Learning time on an example task, cartpole balancing, is thereby reduced about 50 to 100 times. 1 INTRODUCTION In many learning control problems, the evaluation used to modify (and thus improve) control may not be available in terms of the controller's output: instead, it may be in terms of a spatial transformation of the controller's output variables (in which case we shall term it as being "distal in space"), or it may be available only several time steps into the future (termed as being "distal in time"). For example, control of a robot arm may be exerted in terms of joint angles, while evaluation may be in terms of the endpoint cartesian coordinates; furthermore, we may only wish to evaluate the endpoint coordinates reached after a certain period of time: the co- ยทCurrent address: Computation and Neural Systems Program, California Institute of Technology, Pasadena CA.
Refining PID Controllers using Neural Networks
Scott, Gary M., Shavlik, Jude W., Ray, W. Harmon
We apply this method to the task of controlling the outflow and temperature of a water tank, producing statistically-significant gains in accuracy over both a standard neural network approach and a non-learning PID controller. Furthermore, using the PID knowledge to initialize the weights of the network produces statistically less variation in testset accuracy when compared to networks initialized with small random numbers.
Recognition of Manipulated Objects by Motor Learning
We present two neural network controller learning schemes based on feedbackerror-learning and modular architecture for recognition and control of multiple manipulated objects. In the first scheme, a Gating Network is trained to acquire object-specific representations for recognition of a number of objects (or sets of objects). In the second scheme, an Estimation Network is trained to acquire function-specific, rather than object-specific, representations which directly estimate physical parameters. Both recognition networks are trained to identify manipulated objects using somatic and/or visual information. After learning, appropriate motor commands for manipulation of each object are issued by the control networks.
Oscillatory Neural Fields for Globally Optimal Path Planning
A neural network solution is proposed for solving path planning problems faced by mobile robots. The proposed network is a two-dimensional sheet of neurons forming a distributed representation of the robot's workspace. Lateral interconnections between neurons are "cooperative", so that the network exhibits oscillatory behaviour. These oscillations are used to generate solutions of Bellman's dynamic programming equation in the context of path planning. Simulation experiments imply that these networks locate global optimal paths even in the presence of substantial levels of circuit nOlse. 1 Dynamic Programming and Path Planning Consider a 2-DOF robot moving about in a 2-dimensional world. A robot's location is denoted by the real vector, p.
Active Exploration in Dynamic Environments
Thrun, Sebastian B., Mรถller, Knut
Many real-valued connectionist approaches to learning control realize exploration by randomness in action selection. This might be disadvantageous when costs are assigned to "negative experiences". The basic idea presented in this paper is to make an agent explore unknown regions in a more directed manner. This is achieved by a so-called competence map, which is trained to predict the controller's accuracy, and is used for guiding exploration. Based on this, a bistable system enables smoothly switching attention between two behaviors - exploration and exploitation - depending on expected costs and knowledge gain. The appropriateness of this method is demonstrated by a simple robot navigation task.