Connecting Cognitive and Physical Worlds with Dynamic Cost Function Definition

AAAI Conferences

Our goal is to mesh the symbolic reasoning capabilities of a cognitive model with the constrained optimization possibilities inherent in optimal controls. We plan to develop and test such a system for several different dynamical models in environments of differing certainty and differing efficiency requirements.


A Practice Strategy for Robot Learning Control

Neural Information Processing Systems

The most general definition of Adaptive Control is one which includes any controller whose behavior changes in response to the controlled system's behavior. In practice, this definition is usually restricted to modifying a small number of controller parameters in order to maintain system stability or global asymptotic stability of the errors during execution of a single trajectory (see Sastry and Bodson 1989 for a review). Learning Control represents a second level of operation, since it uses Adaptive Control to modify parameters during repeated performance trials of a desired trajectory so that future trials result in greater accuracy (Arimoto et al. 1984). In this paper I present a third level called a "Practice Strategy", in which Learning Control is applied to a sequence of intermediate trajectories leading ultimately to the true desired trajectory. I claim that this can significantly increase learning speed and make learning possible for systems which would otherwise become unstable.
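
The three levels described above can be illustrated with a toy sketch. It is not the method from the paper: the scalar plant `y[t+1] = a*y[t] + b*u[t]`, the proportional trial-to-trial update, the gain, and the choice of scaled-down copies of the desired trajectory as intermediate targets are all assumptions made only to show the control flow (trials nested inside a sequence of practice stages). Because the plant is linear and stable, this toy does not demonstrate the claimed speed-up or stability benefit.

```python
import math

def simulate(u, a=0.5, b=1.0):
    """Roll out the hypothetical plant y[t+1] = a*y[t] + b*u[t] from y[0] = 0."""
    y = [0.0]
    for u_t in u:
        y.append(a * y[-1] + b * u_t)
    return y

def learning_trial(u, target, gain=0.5):
    """One learning-control trial: run u, then correct it with the tracking error."""
    y = simulate(u)
    errors = [target[t + 1] - y[t + 1] for t in range(len(u))]
    new_u = [u_t + gain * e for u_t, e in zip(u, errors)]
    return new_u, max(abs(e) for e in errors)

def practice(desired, scales=(0.25, 0.5, 0.75, 1.0), trials_per_stage=10):
    """Apply learning control to a sequence of intermediate targets.

    Each stage scales the desired trajectory (an assumed choice of
    intermediate trajectory) and reuses the command learned so far.
    """
    u = [0.0] * (len(desired) - 1)
    history = []
    for s in scales:
        target = [s * d for d in desired]
        for _ in range(trials_per_stage):
            u, err = learning_trial(u, target)
        history.append(err)  # error measured on the last trial of the stage
    return u, history

desired = [math.sin(math.pi * t / 20) for t in range(21)]
u, history = practice(desired)
final = simulate(u)
final_error = max(abs(d - y) for d, y in zip(desired, final))
print("per-stage errors:", history)
print("final tracking error:", final_error)
```

The last stage uses a scale of 1.0, so the command it leaves behind is evaluated against the true desired trajectory.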



On the use of Hybrid Control for Legged Locomotion

AAAI Conferences

In this paper, we develop a hybrid control approach for legged locomotion. We motivate the development of the control architecture using the results of a series of walking, running and obstacle-climbing experiments conducted using a six-legged robot called HEX. Our initial simulation results indicate the potential stability of the control approach, and our future analytical work should provide the formal proof of these results.
I. Introduction
It is well known that legged locomotion involves the use of prototypical movements wherein the phase, frequency and amplitude of individual leg motions are related to one another in specific ways. In the literature, such movements are referred to as gaits.


Data Efficient Reinforcement Learning for Legged Robots

arXiv.org Artificial Intelligence

We present a model-based framework for robot locomotion that achieves walking based on only 4.5 minutes (45,000 control steps) of data collected on a quadruped robot. To accurately model the robot's dynamics over a long horizon, we introduce a loss function that tracks the model's prediction over multiple timesteps. We adapt model predictive control to account for planning latency, which allows the learned model to be used for real-time control. Additionally, to ensure safe exploration during model learning, we embed prior knowledge of leg trajectories into the action space. The resulting system achieves fast and robust locomotion. Unlike model-free methods, which optimize for a particular task, our planner can use the same learned dynamics for various tasks, simply by changing the reward function. To the best of our knowledge, our approach is more than an order of magnitude more sample efficient than current model-free methods.
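
The idea of a loss that tracks the model's prediction over multiple timesteps can be sketched as follows. This is an illustration under stated assumptions, not the loss from the paper: the scalar state, the squared-error metric, the function name `multi_step_loss`, and the toy dynamics `s' = 0.9*s + a` are all hypothetical. The point it shows is that the model is rolled forward on its own predictions, so one-step errors compound and are penalized.

```python
def multi_step_loss(model, states, actions, horizon):
    """Mean squared error of open-loop model rollouts of length `horizon`.

    `states` has one more entry than `actions`:
    states[t + 1] is the recorded result of applying actions[t] in states[t].
    """
    total, count = 0.0, 0
    for start in range(len(actions) - horizon + 1):
        pred = states[start]
        for k in range(horizon):
            # Feed the model its own previous prediction, not the recorded state.
            pred = model(pred, actions[start + k])
            total += (pred - states[start + k + 1]) ** 2
            count += 1
    return total / count

# Toy data from assumed "true" dynamics s' = 0.9*s + a with a constant action.
actions = [1.0] * 30
states = [0.0]
for a in actions:
    states.append(0.9 * states[-1] + a)

def exact_model(s, a):
    return 0.9 * s + a

def biased_model(s, a):
    return 0.85 * s + a  # slightly wrong decay coefficient

exact_loss = multi_step_loss(exact_model, states, actions, horizon=5)
single_step = multi_step_loss(biased_model, states, actions, horizon=1)
multi_step = multi_step_loss(biased_model, states, actions, horizon=5)
print(exact_loss, single_step, multi_step)
```

With `horizon=1` this reduces to the usual one-step prediction loss; larger horizons weight the compounding error that a planner would encounter when it rolls the model forward.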