Between Instruction and Reward: Human-Prompted Switching
Pilarski, Patrick M. (University of Alberta) | Sutton, Richard S. (University of Alberta)
Intelligent systems promise to amplify, augment, and extend innate human abilities. A principal example is assistive rehabilitation robots: artificial intelligence and machine learning enable new electromechanical systems that restore biological functions lost through injury or illness. For an intelligent machine to assist a human user, the human must be able to communicate their intentions and preferences to their non-human counterpart. While a human can use a number of techniques to direct a machine learning system, most research to date has focused on the contrasting strategies of instruction and reward. The primary contribution of our work is to demonstrate that the middle ground between instruction and reward is a fertile space for research and immediate technological progress. To support this idea, we introduce the setting of human-prompted switching, and illustrate the successful combination of switching with interactive learning using a concrete real-world example: human control of a multi-joint robot arm. We believe techniques that fall between the domains of instruction and reward are complementary to existing approaches, and will open up new lines of rapid progress for interactive human training of machine learning systems.
Nov-5-2012
- Country:
- Europe > Switzerland
- North America > Canada > Alberta (0.28)
- Industry:
- Education > Educational Setting > Online (0.49)
- Health & Medicine > Consumer Health (0.46)
- Technology: