POMDPs for Risk-Aware Autonomy
Curran, William (Oregon State University) | Bowie, Cameron (Oregon State University) | Smart, William D. (Oregon State University)
Although we would like our robots to have completely autonomous behavior, this is often not possible. Some parts of a task might be hard to automate, perhaps due to hard-to-interpret sensor information or a complex environment. In such cases, using shared autonomy or teleoperation is preferable to an error-prone autonomous approach. However, deciding which parts of a task to allocate to the human and which to the robot can be tricky. In this work, we introduce A3P, a risk-aware task-level reinforcement learning algorithm that discovers when to hand off subtasks to a human assistant. A3P represents the task-level state machine as a Partially Observable Markov Decision Process (POMDP) and explicitly represents failures as additional state-action pairs. Based on this model, the algorithm allows the user to allocate subtasks to the robot or the human so as to manage the worst-case completion time for the overall task.
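The abstract does not give A3P's model in detail, so the following Python sketch is only illustrative of the allocation idea: each subtask is assumed to have a nominal time, a failure probability, and a recovery time for both the robot and the human, a failure is treated as an extra branch that must be traversed before the subtask completes, and allocations are enumerated to minimize the worst-case total task time. All subtask names, numbers, and function names are hypothetical.

    # Illustrative sketch only; the subtask model and worst-case criterion are assumptions.
    from itertools import product

    # Per executor: (nominal time, failure probability, recovery time).
    SUBTASKS = {
        "grasp_object":  {"robot": (5.0, 0.30, 20.0), "human": (8.0, 0.02, 5.0)},
        "move_to_table": {"robot": (4.0, 0.05, 10.0), "human": (6.0, 0.01, 4.0)},
        "place_object":  {"robot": (3.0, 0.20, 15.0), "human": (5.0, 0.02, 3.0)},
    }

    def worst_case_time(nominal, p_fail, recovery):
        # Pessimistic completion time: assume the failure branch is taken
        # whenever its probability is non-zero.
        return nominal + (recovery if p_fail > 0.0 else 0.0)

    def best_allocation(subtasks):
        # Enumerate robot/human assignments and keep the one with the
        # smallest worst-case total task time.
        names = list(subtasks)
        best = None
        for assignment in product(("robot", "human"), repeat=len(names)):
            total = sum(
                worst_case_time(*subtasks[name][who])
                for name, who in zip(names, assignment)
            )
            if best is None or total < best[0]:
                best = (total, dict(zip(names, assignment)))
        return best

    if __name__ == "__main__":
        total, allocation = best_allocation(SUBTASKS)
        print(f"worst-case task time: {total:.1f}s, allocation: {allocation}")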
Dimensionality Reduced Reinforcement Learning for Assistive Robots
Curran, William (Oregon State University) | Brys, Tim (Vrije Universiteit Brussel) | Aha, David (Navy Center for Applied Research in AI) | Taylor, Matthew (Washington State University) | Smart, William D. (Oregon State University)
State-of-the-art personal robots need to perform complex manipulation tasks to be viable in assistive scenarios. However, many of these robots, like the PR2, use high degree-of-freedom manipulators, and the problem is made worse in bimanual manipulation tasks. The complexity of these robots leads to large, high-dimensional state spaces, which are difficult to learn in. We reduce the state space by using demonstrations to discover a representative low-dimensional hyperplane in which to learn. This allows the agent to converge quickly to a good policy. We call this Dimensionality Reduced Reinforcement Learning (DRRL). However, when performing dimensionality reduction, not all dimensions can be fully represented. We extend this work by first learning in a single dimension, and then transferring that knowledge to a higher-dimensional hyperplane. By using our Iterative DRRL (IDRRL) framework with an existing learning algorithm, the agent converges quickly to a better policy by iterating to increasingly higher dimensions. IDRRL is robust to demonstration quality and can learn efficiently using few demonstrations. We show that adding IDRRL to the Q-Learning algorithm leads to faster learning on a set of mountain car tasks and the robot swimmers problem.
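As a rough illustration of the hyperplane idea (the paper's exact projection, transfer scheme, and learner are not given in the abstract), the Python sketch below fits a linear subspace to demonstrated states via an SVD, projects new states onto it, and runs a tabular Q-learning update in the reduced space; IDRRL would then repeat this at increasing dimensions, transferring the learned values. The function names and discretization are assumptions for illustration only.

    # Minimal sketch of the dimensionality-reduction idea behind DRRL/IDRRL;
    # the actual projection and transfer used in the paper may differ.
    import numpy as np

    def fit_hyperplane(demo_states, n_components):
        # Fit a linear low-dimensional subspace (principal directions) to demonstrated states.
        mean = demo_states.mean(axis=0)
        _, _, vt = np.linalg.svd(demo_states - mean, full_matrices=False)
        return mean, vt[:n_components]

    def project(state, mean, components):
        # Project a full-dimensional state onto the learned hyperplane.
        return components @ (state - mean)

    def discretize(z, bins=10, low=-3.0, high=3.0):
        # Coarse discretization of projected coordinates for a tabular learner.
        idx = np.clip(((z - low) / (high - low) * bins).astype(int), 0, bins - 1)
        return tuple(idx)

    def q_learning_step(Q, s, a, r, s_next, n_actions, alpha=0.1, gamma=0.99):
        # Standard tabular Q-learning update in the reduced space.
        q_sa = Q.get((s, a), 0.0)
        q_next = max(Q.get((s_next, b), 0.0) for b in range(n_actions))
        Q[(s, a)] = q_sa + alpha * (r + gamma * q_next - q_sa)
        return Q

    # Example: fit a 1-D hyperplane from stand-in demonstration states,
    # project two states, and perform one update in the reduced space.
    demos = np.random.randn(200, 8)
    mean, comps = fit_hyperplane(demos, n_components=1)
    s = discretize(project(np.random.randn(8), mean, comps))
    s_next = discretize(project(np.random.randn(8), mean, comps))
    Q = q_learning_step({}, s, a=0, r=-1.0, s_next=s_next, n_actions=3)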
Robust Learning from Demonstration Techniques and Tools
Curran, William (Oregon State University)
Large state spaces and the curse of dimensionality contribute to the complexity of a task. Learning from demonstration techniques can be combined with reinforcement learning to narrow an agent's exploration space, but they require consistent and accurate demonstrations, as well as the state-action pairs for an entire demonstration. Individuals with severe motor disabilities are often slow and prone to errors when providing demonstrations. My dissertation develops tools to allow persons with severe motor disabilities, and individuals in general, to train these systems. To handle these large state spaces as well as human error, we developed Dimensionality Reduced Reinforcement Learning. To accommodate slower feedback, we will develop a movie-reel style learning from demonstration interface.