Collaborating Authors

 Griffith, Shane


Policy Shaping: Integrating Human Feedback with Reinforcement Learning

Neural Information Processing Systems

A long-term goal of Interactive Reinforcement Learning is to incorporate non-expert human feedback to solve complex tasks. State-of-the-art methods have approached this problem by mapping human information to reward and value signals that indicate preferences, and then iterating over them to compute the necessary control policy. In this paper we argue for an alternative, more effective characterization of human feedback: Policy Shaping. We introduce Advise, a Bayesian approach that attempts to maximize the information gained from human feedback by using it as direct labels on the policy. We compare Advise to state-of-the-art approaches and highlight scenarios where it outperforms them and, importantly, is robust to infrequent and inconsistent human feedback.
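
The Advise update lends itself to a compact illustration. The minimal Python sketch below assumes the paper's binomial feedback model: for each state-action pair the agent tracks Delta, the number of "right" labels minus "wrong" labels, and, given an assumed feedback consistency C in (0.5, 1], takes C^Delta / (C^Delta + (1-C)^Delta) as the probability that the action is optimal. Multiplying this with a Boltzmann policy over Q-values is one way the feedback can shape the learner's policy; the function names, the temperature parameter, and the toy numbers are illustrative, not taken from the paper.

```python
import numpy as np

def feedback_policy(delta, consistency):
    """Advise-style probability that each action is optimal, given
    delta = (# 'right' labels - # 'wrong' labels) per action and an
    assumed feedback consistency C in (0.5, 1]."""
    c = consistency
    num = c ** delta
    return num / (num + (1.0 - c) ** delta)

def shaped_policy(q_values, delta, consistency, temperature=0.5):
    """Combine a Boltzmann policy over Q-values with the feedback
    policy by multiplying the two distributions and renormalizing."""
    boltzmann = np.exp(q_values / temperature)
    boltzmann /= boltzmann.sum()
    fb = feedback_policy(delta, consistency)
    fb /= fb.sum()                      # normalize across actions
    combined = boltzmann * fb
    return combined / combined.sum()

# Toy example: 3 actions; the human labeled action 0 "right" twice
# and action 2 "wrong" once, with assumed consistency C = 0.9.
q = np.array([0.1, 0.3, 0.2])
delta = np.array([2, 0, -1])
print(shaped_policy(q, delta, consistency=0.9))
```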


Interactive Categorization of Containers and Non-Containers by Unifying Categorizations Derived from Multiple Exploratory Behaviors

AAAI Conferences

The ability to form object categories is an important milestone in human infant development (Cohen 2003). We propose a framework that allows a robot to form a unified object categorization from several interactions with objects. This framework is consistent with the principle that robot learning should ultimately be grounded in the robot's perceptual and behavioral repertoire (Stoytchev 2009). This paper builds upon our previous work (Griffith et al. 2009) by adding more exploratory behaviors (now 6 instead of 1) and by employing consensus clustering to find a single, unified object categorization. The framework was tested on a container/non-container categorization task with 20 objects.

[Figure: exploratory behaviors include a) Drop Block, b) Grasp, c) Move]
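
The abstract does not spell out the consensus-clustering procedure, but a common scheme, sketched below, is to build a co-association matrix (the fraction of behaviors under which two objects land in the same category) and run a final clustering on it. This Python sketch assumes scikit-learn 1.2+ for the precomputed-distance agglomerative step; the helper name and the toy data are illustrative, not the paper's implementation.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def consensus_clusters(labelings, n_clusters=2):
    """Combine per-behavior object categorizations via a co-association
    matrix: entry (i, j) is the fraction of behaviors under which
    objects i and j were grouped together, used as a similarity for a
    final agglomerative clustering."""
    labelings = np.asarray(labelings)       # shape: (behaviors, objects)
    n_behaviors, n_objects = labelings.shape
    coassoc = np.zeros((n_objects, n_objects))
    for labels in labelings:
        coassoc += (labels[:, None] == labels[None, :])
    coassoc /= n_behaviors
    model = AgglomerativeClustering(
        n_clusters=n_clusters, metric="precomputed", linkage="average")
    return model.fit_predict(1.0 - coassoc)  # distance = 1 - similarity

# Toy example: 6 behaviors x 8 objects; each row is one behavior's
# categorization, flipped from the true split with probability 0.15.
rng = np.random.default_rng(0)
true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
noisy = [np.where(rng.random(8) < 0.15, 1 - true, true) for _ in range(6)]
print(consensus_clusters(noisy, n_clusters=2))
```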