Roberts, David
Interactive Learning from Policy-Dependent Human Feedback
MacGlashan, James, Ho, Mark K, Loftin, Robert, Peng, Bei, Wang, Guan, Roberts, David, Taylor, Matthew E., Littman, Michael L.
This paper investigates the problem of interactively learning behaviors communicated by a human teacher using positive and negative feedback. Much previous work on this problem has assumed that the feedback people give for a decision depends on the behavior they are teaching and is independent of the learner's current policy. We present empirical results that show this assumption to be false: whether human trainers give positive or negative feedback for a decision is influenced by the learner's current policy. Based on this insight, we introduce Convergent Actor-Critic by Humans (COACH), an algorithm for learning from policy-dependent feedback that converges to a local optimum. Finally, we demonstrate that COACH can successfully learn multiple behaviors on a physical robot.
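The idea of learning from policy-dependent feedback can be illustrated with a small sketch. It assumes a softmax policy over a handful of actions in a single state and treats the trainer's feedback value as the advantage term in an ordinary policy-gradient step. The function names, the step size, and the single-state setup are illustrative assumptions, not the paper's implementation, and details of the full algorithm are omitted.

```python
import math


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def feedback_update(prefs, action, feedback, step_size=0.5):
    """One policy-gradient step with human feedback standing in for the advantage.

    prefs:     current action preferences (hypothetical policy parameters)
    action:    index of the action the learner just took
    feedback:  trainer signal, e.g. +1 for reward and -1 for punishment
    step_size: illustrative learning rate (an assumption)
    """
    probs = softmax(prefs)
    new_prefs = []
    for i, p in enumerate(prefs):
        # Gradient of log pi(action) with respect to preference i.
        grad_log_pi = (1.0 if i == action else 0.0) - probs[i]
        new_prefs.append(p + step_size * feedback * grad_log_pi)
    return new_prefs


prefs = [0.0, 0.0, 0.0]
rewarded = feedback_update(prefs, action=1, feedback=+1.0)
punished = feedback_update(prefs, action=1, feedback=-1.0)
print(softmax(rewarded))
print(softmax(punished))
```

Positive feedback raises the probability of the action just taken and negative feedback lowers it. Because the gradient depends on the current action probabilities, the same feedback value produces a different update as the policy changes.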
Training an Agent to Ground Commands with Reward and Punishment
MacGlashan, James (Brown University) | Littman, Michael (Brown University) | Loftin, Robert (North Carolina State University) | Peng, Bei (Washington State University) | Roberts, David (North Carolina State University) | Taylor, Matthew (Washington State University)
As robots and autonomous assistants become more capable, there will be a greater need for humans to easily convey to agents the complex tasks they want them to carry out. Conveying tasks through natural language provides an intuitive interface that does not require any technical expertise, but implementing such an interface requires methods for the agent to learn a grounding of natural language commands. In this work, we demonstrate how high-level task groundings can be learned from a human trainer providing online reward and punishment. Grounding language to high-level tasks for the agent to solve removes the need for the human to specify low-level solution details in their command. Using reward and punishment for training makes the training procedure simple enough to be used by people without technical expertise and also allows a human trainer to immediately correct errors in interpretation that the agent has made. We present preliminary results from a single user training an agent in a simple simulated home environment and show that the agent can quickly learn a grounding of language such that the agent can successfully interpret new commands and execute them in a variety of different environments.
When Players Quit (Playing Scrabble)
Harrison, Brent (North Carolina State University) | Roberts, David (North Carolina State University)
What features contribute to player enjoyment and player retention has been a popular research topic in video games research; however, the question of what causes players to quit a game has received little attention by comparison. In this paper, we examine 5 quantitative features of the game Scrabblesque in order to determine what behaviors are predictors of a player prematurely ending a game session. We identified a feature transformation that notably improves prediction accuracy. We used a naive Bayes model to determine that there are several transformed feature sequences that are accurate predictors of players terminating game sessions before the end of the game. We also identify several trends that exist in these sequences to give a more general idea as to what behaviors are characteristic early indicators of players quitting.
A Review of Student Modeling Techniques in Intelligent Tutoring Systems
Harrison, Brent (North Carolina State University) | Roberts, David (North Carolina State University)
In this paper, we survey techniques used in intelligent tutoring systems (ITSs) to model student knowledge. The three techniques that we review in detail are knowledge tracing, performance factor analysis, and matrix factorization. We also briefly cover other techniques that have been used. This review is meant to be a repository of knowledge for those who want to integrate these techniques into serious games. It is also meant to raise awareness of, and interest in, the techniques available for integration into serious games.
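As an illustration of the first technique named above, the following is a minimal sketch of a knowledge tracing update, assuming the standard Bayesian formulation with slip, guess, and learn parameters. The function name and the default parameter values are illustrative assumptions and are not taken from the review.

```python
def bkt_update(p_known, correct, p_slip=0.1, p_guess=0.2, p_learn=0.15):
    """Update the estimated probability that a student knows a skill.

    p_known: prior probability the skill is known
    correct: whether the student answered the latest item correctly
    p_slip:  chance of a wrong answer despite knowing the skill (assumed value)
    p_guess: chance of a right answer without knowing the skill (assumed value)
    p_learn: chance of learning the skill after this opportunity (assumed value)
    """
    if correct:
        known_and_observed = p_known * (1.0 - p_slip)
        evidence = known_and_observed + (1.0 - p_known) * p_guess
    else:
        known_and_observed = p_known * p_slip
        evidence = known_and_observed + (1.0 - p_known) * (1.0 - p_guess)
    # Bayes' rule: probability the skill was known given the observed answer.
    posterior = known_and_observed / evidence
    # Account for the chance that the student learned the skill on this step.
    return posterior + (1.0 - posterior) * p_learn


# Trace a hypothetical student through a short sequence of answers.
p = 0.5
for answer in [True, True, False, True]:
    p = bkt_update(p, answer)
    print(round(p, 3))
```

A correct answer pushes the estimate up and an incorrect one pulls it down, with the slip and guess parameters controlling how much weight each observation carries.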