An Actor/Critic Algorithm that is Equivalent to Q-Learning

Dec-31-1995–Neural Information Processing Systems

We prove the convergence of an actor/critic algorithm that is equivalent to Q-learning by construction. Its equivalence is achieved by encoding Q-values within the policy and value function of the actor and critic. The resultant actor/critic algorithm is novel in two ways: it updates the critic only when the most probable action is executed from any given state, and it rewards the actor using criteria that depend on the relative probability of the action that was executed.
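
To make "encoding Q-values within the policy and value function" concrete, here is a minimal tabular sketch. The abstract does not give the paper's exact update rules, so everything here is an assumption made for illustration: the critic stores `V[s]`, the actor stores preferences `P[s, a]`, and Q-values are recovered as `V[s] + P[s, a]`, with the top preference in each state normalized to zero. The helper names (`encode_q`, `decode_q`, `q_learning_step`) are hypothetical. The sketch only shows that a Q-learning update can be carried out entirely on such an actor/critic pair. It does not reproduce the paper's specific critic-update condition or its actor reward criteria.

```python
import numpy as np

def encode_q(Q):
    """Split a Q-table into critic values V[s] and actor preferences P[s, a].

    Assumed encoding (not taken from the paper): V[s] = max_a Q[s, a] and
    P[s, a] = Q[s, a] - V[s], so the most preferred action has preference 0.
    """
    V = Q.max(axis=1)
    P = Q - V[:, None]
    return V, P

def decode_q(V, P):
    """Recover the Q-table implied by the actor/critic pair."""
    return V[:, None] + P

def q_learning_step(V, P, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update directly to (V, P), in place."""
    # Under the assumed encoding, max_a' Q(s', a') is exactly V[s_next].
    target = r + gamma * V[s_next]
    q_row = V[s] + P[s]                 # implied Q-values for state s (a copy)
    q_row[a] += alpha * (target - q_row[a])
    # Re-normalize so the top preference in state s is zero again.
    V[s] = q_row.max()
    P[s] = q_row - V[s]
    return V, P
```

In this sketch `V[s]` changes only when the update moves the maximum of the implied Q-values in state `s`, which gives some intuition for why a critic in this kind of encoding need not be updated on every step.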

artificial intelligence, machine learning, reinforcement learning

Country:
- North America > United States > Massachusetts > Hampshire County > Amherst (0.15)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
