Fleet-DAgger: Interactive Robot Fleet Learning with Scalable Human Supervision
Hoque, Ryan, Chen, Lawrence Yunliang, Sharma, Satvik, Dharmarajan, Karthik, Thananjeyan, Brijen, Abbeel, Pieter, Goldberg, Ken
–arXiv.org Artificial Intelligence
Amazon, Nimble, Plus One, Waymo, and Zoox use remote human supervision of robot fleets in applications ranging from self-driving taxis to automated warehouse fulfillment [1, 2, 3, 4, 5]. These robots intermittently cede control during task execution to remote human supervisors for corrective interventions. The interventions take place either during learning, when they are used to improve the robot policy, or during execution, when the policy is no longer updated but robots can still request human assistance when needed to improve reliability. In the continual learning setting, these occur simultaneously: the robot policy has been deployed but continues to be updated indefinitely with additional intervention data. Furthermore, any individual robot can share its intervention data with the rest of the fleet. As opposed to robot swarms that must coordinate with each other to achieve a common objective, a robot fleet is a set of independent robots simultaneously executing the same control policy in parallel environments. We refer to the setting of a robot fleet learning via interactive requests for human supervision (see Figure 1) as Interactive Fleet Learning (IFL). Of central importance in IFL is the supervisor allocation problem: how should limited human supervision be allocated to robots in a manner that maximizes the throughput of the fleet?
arXiv.org Artificial Intelligence
Nov-16-2022
- Country:
- North America > United States > California (0.28)
- Genre:
- Research Report (1.00)
- Industry:
- Education > Educational Setting > Online (0.47)
- Technology: