Learning via Human Feedback in Continuous State and Action Spaces

Ngo, Vien Anh (Ravensburg-Weingarten University of Applied Sciences) | Ertel, Wolfgang (Ravensburg-Weingarten University of Applied Sciences)

AAAI Conferences 

We consider the problem of extending manually trained agents via evaluative reinforcement (TAMER) to continuous state and action spaces. The earlier TAMER framework allows a non-technical human to train an agent through a natural form of human feedback, negative or positive. The advantages of TAMER have been shown in applications such as training Tetris and Mountain Car agents with only human feedback, and Cart-pole and Mountain Car agents with both human feedback and environment reward (augmenting reinforcement learning with human feedback). However, those methods were originally designed for problems with discrete states and actions, or with continuous states and discrete actions. In this paper, we introduce an extension of TAMER that allows both continuous states and actions. The new scheme, actor-critic TAMER, extends the original TAMER to allow the use of any general function approximation of a human trainer's reinforcement signal. Our extension still allows reinforcement learning to be easily combined with human feedback. The experimental results show that the proposed method helps a human trainer successfully train an agent in two continuous state-action domains: Mountain Car and Cart-pole (balancing).
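
To make the actor-critic idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a linear critic that predicts the trainer's feedback, a Gaussian policy whose mean is linear in the state features, and a scripted stand-in for the human trainer. All names (`features`, `ActorCriticFromFeedback`, `simulated_trainer`, `train`) and all step sizes are hypothetical choices made for illustration.

```python
import random


def features(state):
    """Hypothetical feature map: a bias term plus the raw 1-D state."""
    return [1.0, state]


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


class ActorCriticFromFeedback:
    """Illustrative actor-critic learner driven only by scalar feedback."""

    def __init__(self, n_features, sigma=0.5, alpha_actor=0.05, alpha_critic=0.1):
        self.theta = [0.0] * n_features  # actor: weights of the policy mean
        self.w = [0.0] * n_features      # critic: weights of predicted feedback
        self.sigma = sigma               # fixed exploration noise (assumption)
        self.alpha_actor = alpha_actor
        self.alpha_critic = alpha_critic

    def act(self, x, rng):
        # Sample a continuous action from N(theta . x, sigma^2).
        return rng.gauss(dot(self.theta, x), self.sigma)

    def update(self, x, action, feedback):
        # Error between the trainer's signal and the critic's prediction.
        delta = feedback - dot(self.w, x)
        # d/d(mean) of log N(action; mean, sigma^2) is (action - mean) / sigma^2.
        score = (action - dot(self.theta, x)) / (self.sigma ** 2)
        for i, xi in enumerate(x):
            self.w[i] += self.alpha_critic * delta * xi
            self.theta[i] += self.alpha_actor * delta * score * xi
        return delta


def simulated_trainer(action, policy_mean, target):
    """Scripted stand-in for a human: +1 if the action beats the policy mean."""
    return 1.0 if abs(action - target) < abs(policy_mean - target) else -1.0


def train(steps, seed, target=1.0):
    rng = random.Random(seed)
    learner = ActorCriticFromFeedback(n_features=2)
    for _ in range(steps):
        x = features(rng.uniform(-1.0, 1.0))
        mean = dot(learner.theta, x)
        action = learner.act(x, rng)
        learner.update(x, action, simulated_trainer(action, mean, target))
    return learner
```

Calling `train(2000, seed=0)` runs the loop against the scripted trainer; in a real setting the call to `simulated_trainer` would be replaced by the positive or negative signal supplied by a person watching the agent.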
