Reward from Demonstration in Interactive Reinforcement Learning

Raza, Syed Ali (University of Technology, Sydney) | Johnston, Benjamin (University of Technology, Sydney) | Williams, Mary-Anne (University of Technology, Sydney)

May-8-2016–AAAI Conferences

In reinforcement learning (RL), reward shaping is used to show the desirable behavior by assigning positive or negative reward for learner’s preceding action. However, for reward shaping through human-generated rewards, an important aspect is to make it approachable to humans. Typically, a human teacher’s role requires being watchful of agent’s action to assign judgmental feedback based on prior knowledge. It can be a mentally tough and unpleasant exercise especially for lengthy teaching sessions. We present a method, Shaping from Interactive Demonstrations (SfID), which instead of judgmental reward takes action label from human. Therefore, it simplifies the teacher’s role to demonstrating the action to select from a state. We compare SfID with a standard reward shaping approach on Sokoban domain. The results show the competitiveness of SfID with the standard reward shaping.

artificial intelligence, interactive reinforcement learning, machine learning, (1 more...)

AAAI Conferences

May-8-2016

Conferences PDF

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.60)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found