Efficient Nonlinear Control with Actor-Tutor Architecture
It was demonstrated in the simulation of a pendulum swing-up task that the value-gradient-based control scheme requires far fewer learning trials than the conventional "actor-critic" control scheme (Barto et al., 1983). In the actor-critic scheme, the actor, a direct feedback controller, improves its control policy stochastically using the TD error as the effective reinforcement (Figure 1a). Despite its relatively slow learning, the actor-critic architecture has the virtue of simple computation in generating the control command. In order to train a direct controller while making efficient use of the value function, we propose a new reinforcement learning scheme which we call the "actor-tutor" architecture (Figure 1b).
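To make the contrast concrete, the following is a minimal sketch, not the paper's implementation, of the two update rules on a generic continuous-state task with a scalar control. The feature map `phi`, the random projection `P`, the learning rates, and the argument `df_du` (the Jacobian of the system dynamics with respect to the control) are all illustrative assumptions; only the structure of the updates follows the text above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_state, n_feat = 2, 16
P = rng.normal(size=(n_feat, n_state))   # fixed random feature projection (assumed)

def phi(x):
    return np.tanh(P @ x)                # nonlinear state features (assumed)

def dphi_dx(x):
    # Jacobian of phi: diag(1 - tanh(Px)^2) @ P, shape (n_feat, n_state)
    return (1.0 - np.tanh(P @ x) ** 2)[:, None] * P

w_v = np.zeros(n_feat)    # critic weights: V(x) ~ w_v @ phi(x)
w_pi = np.zeros(n_feat)   # actor weights:  u(x) ~ w_pi @ phi(x)
gamma, a_v, a_pi = 0.95, 0.1, 0.02

def actor_critic_update(x, u, r, x_next):
    """Actor-critic: the TD error reinforces the stochastic exploration."""
    global w_v, w_pi
    delta = r + gamma * (w_v @ phi(x_next)) - w_v @ phi(x)  # TD error
    w_v += a_v * delta * phi(x)                             # critic step
    u_mean = w_pi @ phi(x)
    # The exploratory deviation (u - u_mean) is strengthened or weakened
    # in proportion to the TD error, the "effective reinforcement".
    w_pi += a_pi * delta * (u - u_mean) * phi(x)
    return delta

def actor_tutor_update(x, df_du):
    """Actor-tutor sketch: a value-gradient-based 'tutor' command serves
    as a supervised target for the direct controller (the actor)."""
    global w_pi
    dV_dx = dphi_dx(x).T @ w_v        # gradient of the learned value function
    u_tutor = float(df_du @ dV_dx)    # greedy direction: (df/du)^T dV/dx
    u_actor = w_pi @ phi(x)
    w_pi += a_pi * (u_tutor - u_actor) * phi(x)   # move actor toward tutor
    return u_tutor
```

The structural difference is visible in the two functions: the actor-critic rule learns from a noisy scalar reinforcement signal, whereas the tutor supplies a full vector-valued target command derived from the value gradient, which is what permits the faster learning claimed above while retaining a direct controller that is cheap to evaluate at run time.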
Neural Information Processing Systems
Dec-31-1997