Curran
Although we would like our robots to behave completely autonomously, this is often not possible. Some parts of a task may be hard to automate, perhaps due to hard-to-interpret sensor information or a complex environment. In such cases, shared autonomy or teleoperation is preferable to an error-prone autonomous approach. However, deciding which parts of a task to allocate to the human and which to the robot can be tricky. In this work, we introduce A3P, a risk-aware task-level reinforcement learning algorithm that discovers when to hand off subtasks to a human assistant. A3P models a task-level state machine as a Partially Observable Markov Decision Process (POMDP) and explicitly represents failures as additional state-action pairs. Based on this model, the algorithm allows the user to allocate subtasks to the robot or the human in such a way as to manage the worst-case completion time for the overall task.
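To illustrate the allocation objective only (not the A3P algorithm itself, which learns over a POMDP), the following is a minimal sketch: given hypothetical worst-case completion times for each subtask under robot or human execution, brute-force the assignment that minimizes the total worst-case time. All subtask names and timings here are invented for illustration.

```python
from itertools import product

# Hypothetical worst-case completion times (seconds) per subtask,
# under robot vs. human execution. In the full POMDP model, failure
# state-action pairs would inflate the robot's worst case.
worst_case = {
    "grasp":  {"robot": 40.0, "human": 15.0},
    "insert": {"robot": 12.0, "human": 20.0},
    "place":  {"robot": 8.0,  "human": 18.0},
}

def best_allocation(times):
    """Brute-force the robot/human assignment minimizing total worst-case time."""
    subtasks = list(times)
    best = None
    for choice in product(["robot", "human"], repeat=len(subtasks)):
        total = sum(times[s][a] for s, a in zip(subtasks, choice))
        if best is None or total < best[1]:
            best = (dict(zip(subtasks, choice)), total)
    return best

alloc, total = best_allocation(worst_case)
# With the times above: grasp -> human, insert -> robot, place -> robot
```

Brute force is exponential in the number of subtasks; it stands in here for the model-based allocation the paper describes, which additionally accounts for partial observability and failure risk.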