Learning to Control an Octopus Arm with Gaussian Process Temporal Difference Methods
Engel, Yaakov, Szabo, Peter, Volkinshtein, Dmitry
The Octopus arm is a highly versatile and complex limb. How the Octopus controls such a hyper-redundant arm (not to mention eight of them!) is as yet unknown. Robotic arms based on the same mechanical principles may render present-day robotic arms obsolete. In this paper, we tackle this control problem using an online reinforcement learning algorithm, based on a Bayesian approach to policy evaluation known as Gaussian process temporal difference (GPTD) learning. Our substitute for the real arm is a computer simulation of a 2-dimensional model of an Octopus arm. Even with the simplifications inherent to this model, the state space we face is a high-dimensional one. We apply a GPTD-based algorithm to this domain, and demonstrate its operation on several learning tasks of varying degrees of difficulty.