Collaborating Authors

 Volkinshtein, Dmitry


Temporal Difference Based Actor Critic Learning - Convergence and Neural Implementation

Neural Information Processing Systems

Actor-critic algorithms for reinforcement learning are achieving renewed popularity due to their good convergence properties in situations where other approaches often fail (e.g., when function approximation is involved). Interestingly, there is growing evidence that actor-critic approaches based on phasic dopamine signals play a key role in biological learning through cortical and basal ganglia loops. We derive a temporal-difference-based actor-critic learning algorithm, for which convergence can be proved without assuming widely separated time scales for the actor and the critic. The approach is demonstrated by applying it to networks of spiking neurons. The established relation between phasic dopamine and the temporal difference signal lends support to the biological relevance of such algorithms.
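
The paper's contribution is the convergence analysis and the spiking-neuron implementation; as a rough illustration of the kind of update rule involved, here is a minimal sketch of a tabular TD actor-critic loop on a toy chain problem. The environment, step sizes, and variable names below are illustrative assumptions, not taken from the paper:

```python
# Minimal TD actor-critic sketch on a toy 5-state chain (illustrative only).
import numpy as np

n_states, n_actions = 5, 2
gamma = 0.95
alpha_critic, alpha_actor = 0.1, 0.01  # step sizes (assumed values)

V = np.zeros(n_states)                   # critic: state-value estimates
theta = np.zeros((n_states, n_actions))  # actor: policy preferences

def policy(s):
    """Softmax policy over the action preferences at state s."""
    p = np.exp(theta[s] - theta[s].max())
    return p / p.sum()

def step(s, a):
    """Toy chain: action 1 moves right, action 0 moves left;
    reward 1 whenever the transition lands on the rightmost state."""
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    r = 1.0 if s_next == n_states - 1 else 0.0
    return s_next, r

rng = np.random.default_rng(0)
s = 0
for t in range(20000):
    p = policy(s)
    a = rng.choice(n_actions, p=p)
    s_next, r = step(s, a)
    delta = r + gamma * V[s_next] - V[s]    # TD error (phasic-dopamine analogue)
    V[s] += alpha_critic * delta            # critic update
    grad = -p                               # gradient of log softmax ...
    grad[a] += 1.0                          # ... w.r.t. theta[s]
    theta[s] += alpha_actor * delta * grad  # actor update driven by the TD error
    s = s_next

print("learned state values:", np.round(V, 2))
```

The single TD error `delta` drives both the critic and the actor updates; it is this shared scalar signal that connects such algorithms to the phasic dopamine interpretation discussed in the abstract.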


Learning to Control an Octopus Arm with Gaussian Process Temporal Difference Methods

Neural Information Processing Systems

The Octopus arm is a highly versatile and complex limb. How the Octopus controls such a hyper-redundant arm (not to mention eight of them!) is as yet unknown. Robotic arms based on the same mechanical principles may render present-day robotic arms obsolete. In this paper, we tackle this control problem using an online reinforcement learning algorithm, based on a Bayesian approach to policy evaluation known as Gaussian process temporal difference (GPTD) learning. Our substitute for the real arm is a computer simulation of a 2-dimensional model of an Octopus arm. Even with the simplifications inherent to this model, the state space we face is a high-dimensional one. We apply a GPTD-based algorithm to this domain, and demonstrate its operation on several learning tasks of varying degrees of difficulty.
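
For a sense of how GPTD casts policy evaluation as GP regression, here is a minimal batch sketch on a 1-D toy chain, assuming the deterministic-transition noise model of the original GPTD formulation. The kernel width, noise level, and toy dynamics are illustrative assumptions; the paper itself applies an online, sparsified GPTD variant inside a policy-improvement loop on the far higher-dimensional octopus-arm simulation:

```python
# Minimal batch GPTD sketch: rewards r_t = V(x_t) - gamma*V(x_{t+1}) + noise,
# so r = H v + noise with H bidiagonal in (1, -gamma). Illustrative only.
import numpy as np

gamma, sigma = 0.9, 0.1  # discount factor, reward-noise std (assumed)

def kernel(X1, X2, ell=0.5):
    """Squared-exponential kernel between 1-D state arrays."""
    d = X1[:, None] - X2[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

# Toy trajectory: drift rightward on [0, 1], reward grows with position.
rng = np.random.default_rng(0)
T = 50
x = np.zeros(T + 1)
for t in range(T):
    x[t + 1] = min(x[t] + 0.05 + 0.01 * rng.standard_normal(), 1.0)
r = x[1:] + sigma * rng.standard_normal(T)  # noisy rewards along the path

# Bidiagonal model matrix H: row t encodes V(x_t) - gamma * V(x_{t+1}).
H = np.zeros((T, T + 1))
H[np.arange(T), np.arange(T)] = 1.0
H[np.arange(T), np.arange(T) + 1] = -gamma

K = kernel(x, x)
G = H @ K @ H.T + sigma ** 2 * np.eye(T)  # Gram matrix of the observed rewards
alpha = np.linalg.solve(G, r)

# Posterior mean of the value function at test states: k(x*, x) H^T alpha.
x_test = np.linspace(0.0, 1.0, 5)
v_mean = kernel(x_test, x) @ H.T @ alpha
print("posterior value estimates:", np.round(v_mean, 2))
```

The Bayesian posterior also provides a value-uncertainty estimate (the predictive covariance), which is one of the features that makes GPTD attractive for online learning in large state spaces like the simulated arm's.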