Fast Learning with Predictive Forward Models

Brody, Carlos

Neural Information Processing Systems 

A method for transforming performance evaluation signals that are distal in both space and time into proximal signals usable by supervised learning algorithms, presented in [Jordan & Jacobs 90], is examined. A simple observation concerning differentiation through models trained with redundant inputs (as one of their networks is) explains a weakness in the original architecture and suggests a modification: an internal world model is added that encodes action-space exploration and, crucially, cancels input redundancy to the forward model. Learning time on an example task, cartpole balancing, is thereby reduced by a factor of roughly 50 to 100.

1 INTRODUCTION

In many learning control problems, the evaluation used to modify (and thus improve) control may not be available in terms of the controller's output: instead, it may be given in terms of a spatial transformation of the controller's output variables (in which case we shall term it "distal in space"), or it may be available only several time steps into the future (termed "distal in time"). For example, control of a robot arm may be exerted in terms of joint angles, while evaluation may be in terms of the endpoint Cartesian coordinates; furthermore, we may only wish to evaluate the endpoint coordinates reached after a certain period of time: the co-

*Current address: Computation and Neural Systems Program, California Institute of Technology, Pasadena CA.
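The forward-model idea described above can be illustrated with a toy linear analogue (this sketch is not the paper's cartpole setup or network architecture; the world mapping `W_true`, the learning rates, and the bias-only controller are all illustrative assumptions). A forward model is first fitted from random action-space exploration; the controller is then improved by backpropagating the distal error through the frozen model, which converts it into a proximal error signal:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "world": the outcome is a spatial transformation of the
# action, so the evaluation signal is distal in space.
W_true = np.array([[2.0, -1.0],
                   [0.5,  1.5]])

def world(a):
    return W_true @ a

# 1) Fit a forward model W_hat from random action-space exploration.
W_hat = np.zeros((2, 2))
lr = 0.1
for _ in range(500):
    a = rng.normal(size=2)          # exploratory action
    err = W_hat @ a - world(a)      # prediction error of the model
    W_hat -= lr * np.outer(err, a)  # gradient step on squared error

# 2) Train a controller (here just a constant action c) so that the
#    outcome reaches a distal target, differentiating through the
#    *frozen* forward model rather than the inaccessible world.
target = np.array([1.0, -1.0])
c = np.zeros(2)
for _ in range(200):
    dL_dy = W_hat @ c - target      # distal error at the outcome
    c -= lr * (W_hat.T @ dL_dy)     # chain rule through the model

print(np.allclose(world(c), target, atol=1e-2))
```

The design point matches the abstract: the controller never receives a gradient from the world directly; the learned forward model supplies it, so the quality of that model (and of the exploration used to fit it) governs how fast the controller learns.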
