Learning via Human Feedback in Continuous State and Action Spaces

Ngo, Vien Anh (Ravensburg-Weingarten University of Applied Sciences) | Ertel, Wolfgang (Ravensburg-Weingarten University of Applied Sciences)

Nov-5-2012–AAAI Conferences

We consider the problem of extending training an agent manually via evaluative reinforcement (TAMER) to continuous state and action spaces. The earlier TAMER framework allows a non-technical human to train an agent through a natural form of human feedback, negative or positive. The advantages of TAMER have been shown on applications such as training Tetris and Mountain Car agents with only human feedback, and Cart-pole and Mountain Car agents with both human feedback and environment reward (augmenting reinforcement learning with human feedback). However, those methods were originally designed for problems with discrete states and actions, or with continuous states and discrete actions. In this paper, we introduce an extension of TAMER that allows both continuous states and actions. The new scheme, actor-critic TAMER, extends the original TAMER to allow any general function approximation of a human trainer's reinforcement signal. Our extension still allows reinforcement learning to be easily combined with human feedback. The experimental results show that the proposed method helps a human trainer successfully train an agent in two continuous state-action domains: Mountain Car and Cart-pole (balancing).
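The abstract does not give the update rules, so the following is only a minimal sketch of the general idea of pairing a critic that models the human trainer's feedback with an actor that outputs continuous actions. It is not the authors' algorithm. Every name in it (`features`, `ActorCriticTamerSketch`, `train_with_simulated_trainer`), the linear function approximators, the Gaussian policy, the step sizes, and the simulated trainer are assumptions made for illustration.

```python
import random


def features(state):
    """Hypothetical feature map: a bias term followed by the raw state."""
    return [1.0] + list(state)


class ActorCriticTamerSketch:
    """Illustrative only: linear critic of human feedback, Gaussian actor."""

    def __init__(self, n_features, alpha=0.1, beta=0.05, sigma=0.5, seed=0):
        self.w = [0.0] * n_features      # critic weights (predicted feedback)
        self.theta = [0.0] * n_features  # actor weights (policy mean)
        self.alpha = alpha               # critic step size (assumed)
        self.beta = beta                 # actor step size (assumed)
        self.sigma = sigma               # fixed exploration noise (assumed)
        self.rng = random.Random(seed)

    def mean_action(self, phi):
        return sum(t * f for t, f in zip(self.theta, phi))

    def act(self, state):
        """Sample a continuous action around the current policy mean."""
        return self.rng.gauss(self.mean_action(features(state)), self.sigma)

    def update(self, state, action, human_feedback):
        phi = features(state)
        predicted = sum(w * f for w, f in zip(self.w, phi))
        # How much better or worse the feedback was than the critic expected.
        error = human_feedback - predicted
        # Gradient of the Gaussian log-likelihood with respect to the mean.
        scale = (action - self.mean_action(phi)) / (self.sigma ** 2)
        # Critic: regress the predicted feedback toward the observed feedback.
        self.w = [w + self.alpha * error * f for w, f in zip(self.w, phi)]
        # Actor: shift the mean toward actions that beat the critic's estimate.
        self.theta = [t + self.beta * error * scale * f
                      for t, f in zip(self.theta, phi)]
        return error


def train_with_simulated_trainer(steps=200):
    """Stand-in for a human: rewards positive actions, punishes the rest."""
    agent = ActorCriticTamerSketch(n_features=2)
    state = (0.0,)
    for _ in range(steps):
        action = agent.act(state)
        feedback = 1.0 if action > 0 else -1.0
        agent.update(state, action, feedback)
    return agent
```

In this toy setting the policy mean drifts toward the positive actions that the simulated trainer rewards, while the critic's prediction tracks the feedback the trainer tends to give. A real trainer's feedback would be delayed and noisy, which this sketch ignores.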

artificial intelligence, machine learning, reinforcement learning

Country:
- Europe > Germany (0.04)
- North America > United States
  - Massachusetts > Middlesex County
    - Cambridge (0.04)
  - Colorado > Denver County
    - Denver (0.04)
- Asia > Middle East
  - Jordan (0.04)

Genre:
- Research Report (0.34)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
