Mohseni-Kabir, Anahita
Interaction-Aware Multi-Agent Reinforcement Learning for Mobile Agents with Individual Goals
Mohseni-Kabir, Anahita, Isele, David, Fujimura, Kikuo
In a multi-agent setting, the optimal policy of a single agent depends heavily on the behavior of the other agents. We investigate the problem of multi-agent reinforcement learning, focusing on decentralized learning in non-stationary domains for mobile robot navigation. We identify a cause of the difficulty in training non-stationary policies: mutual adaptation to sub-optimal behaviors, and we use this to motivate a curriculum-based strategy for learning interactive policies. The curriculum has two stages. First, the agent leverages policy gradient algorithms to learn a policy capable of achieving multiple goals. Second, the agent learns a modifier policy that governs how to interact with other agents in a multi-agent setting. We evaluate our approach on both an autonomous driving lane-change domain and a robot navigation domain.

Single-agent reinforcement learning (RL) algorithms have made significant progress in game playing [20] and robotics [13]; however, single-agent learning algorithms in multi-agent settings are prone to learning stereotyped behaviors that over-fit to the training environment [22], [15]. There are several reasons why multi-agent environments are more difficult: 1) interacting with an unknown agent requires having either multiple responses to a given situation or a more nuanced ability to perceive differences. The former breaks the Markov assumption; the latter rules out simpler solutions, which are likely to be found first.
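The two-stage curriculum above can be sketched in miniature: a goal-conditioned base policy is learned first, then frozen while a modifier policy learns to adjust its actions around other agents. Everything here (the 1-D corridor, the yield rule, the `yield_margin` parameter) is an illustrative assumption, not the paper's actual architecture.

```python
# Hypothetical sketch of the two-stage curriculum: a frozen
# goal-conditioned base policy plus a learned interaction "modifier".
# Agents live on a 1-D line; actions are -1, 0, or +1.

class BasePolicy:
    """Stage 1: goal-conditioned policy trained in a single-agent setting."""
    def act(self, state, goal):
        # Toy rule standing in for a learned policy: step toward the goal.
        return 1 if goal > state else -1

class ModifierPolicy:
    """Stage 2: nudges the base action to handle interaction with others."""
    def __init__(self):
        self.yield_margin = 1  # illustrative "learned" parameter

    def act(self, state, others, base_action):
        # Toy interaction rule: yield if another agent occupies (or is
        # within the margin of) the cell we intend to move into.
        next_pos = state + base_action
        if any(abs(other - next_pos) < self.yield_margin for other in others):
            return 0
        return base_action

def policy(state, goal, others, base, modifier):
    """Compose the frozen base policy with the interaction modifier."""
    base_action = base.act(state, goal)
    return modifier.act(state, others, base_action)

base, mod = BasePolicy(), ModifierPolicy()
print(policy(0, 5, others=[], base=base, modifier=mod))   # advances: 1
print(policy(0, 5, others=[1], base=base, modifier=mod))  # yields: 0
```

The design choice the sketch highlights is that the multi-goal competence from stage 1 is reused, so stage 2 only has to learn the interactive correction rather than the whole task.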
Towards Robot Adaptability in New Situations
Boteanu, Adrian (Worcester Polytechnic Institute) | Kent, David (Worcester Polytechnic Institute) | Mohseni-Kabir, Anahita (Worcester Polytechnic Institute) | Rich, Charles (Worcester Polytechnic Institute) | Chernova, Sonia (Worcester Polytechnic Institute)
We present a system that integrates robot task execution with user input and feedback at multiple abstraction levels in order to achieve greater adaptability in new environments. The user can specify a hierarchical task, with the system interactively proposing logical action groupings within the task. During execution, if tasks fail because objects specified in the initial task description are not found in the environment, the robot proposes substitutions autonomously in order to repair the plan and resume execution. The user can assist the robot by reviewing substitutions. Finally, the user can train the robot to recognize and manipulate novel objects, either during training or during execution. In addition to this single-user scenario, we propose extensions that leverage crowdsourced input to reduce the need for direct user feedback.
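The plan-repair loop described above — propose a substitute for a missing object, let the user review it, then resume — can be sketched as follows. The substitution table, function names, and the `approve` callback are illustrative assumptions, not the system's actual interface.

```python
# Hypothetical sketch of repair-by-substitution for one task step.
# SUBSTITUTES stands in for whatever similarity knowledge ranks
# candidate replacements for a missing object.

SUBSTITUTES = {
    # object required by the task -> ranked candidate replacements
    "mug": ["cup", "glass"],
    "screwdriver": ["coin"],
}

def repair_step(required, available, approve=lambda orig, cand: True):
    """Return an object the robot can act on for this step, or None.

    `approve` models the optional user review of a proposed substitution;
    by default every proposal is accepted.
    """
    if required in available:
        return required  # no repair needed
    for candidate in SUBSTITUTES.get(required, []):
        if candidate in available and approve(required, candidate):
            return candidate
    return None  # repair failed; execution cannot resume

print(repair_step("mug", {"glass", "bowl"}))  # -> glass
```

A usage note: passing `approve=lambda orig, cand: False` models a user rejecting every proposal, in which case the step fails and the task cannot resume, mirroring the fallback to direct user feedback described above.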
Collaborative Learning of Hierarchical Task Networks from Demonstration and Instruction
Mohseni-Kabir, Anahita (Worcester Polytechnic Institute) | Chernova, Sonia (Worcester Polytechnic Institute) | Rich, Charles (Worcester Polytechnic Institute)
In this work, we focus on advancing the state of the art in intelligent agents that can learn complex procedural tasks from humans. Our main innovation is to view the interaction between the human and the robot as a mixed-initiative collaboration. Our contribution is to integrate hierarchical task networks and collaborative discourse theory into the learning from demonstration paradigm, enabling robots to learn complex tasks in collaboration with a human teacher.
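The hierarchical task networks mentioned above can be pictured as trees whose leaves are primitive actions and whose internal nodes group subtasks, as a teacher might build up interactively. The class and the tire-change example below are an illustrative sketch under that assumption, not the paper's representation.

```python
# Hypothetical sketch of a hierarchical task network (HTN): tasks with
# no subtasks are primitive actions; others decompose into subtasks.

class Task:
    def __init__(self, name, subtasks=None):
        self.name = name
        self.subtasks = subtasks or []  # empty list => primitive action

    def primitives(self):
        """Flatten the hierarchy into an executable action sequence."""
        if not self.subtasks:
            return [self.name]
        return [p for sub in self.subtasks for p in sub.primitives()]

# A small hierarchy with one intermediate grouping proposed mid-task:
tire = Task("change_tire", [
    Task("unfasten", [Task("loosen_nuts"), Task("jack_up")]),
    Task("swap_wheel"),
])
print(tire.primitives())  # ['loosen_nuts', 'jack_up', 'swap_wheel']
```

The intermediate `unfasten` node illustrates the kind of logical action grouping a collaborative dialogue could introduce without changing the primitive sequence the robot executes.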