Observe Then Act: Asynchronous Active Vision-Action Model for Robotic Manipulation

Wang, Guokang, Li, Hang, Zhang, Shuyuan, Liu, Yanhong, Liu, Huaping

Oct-1-2024–arXiv.org Artificial Intelligence

In real-world scenarios, many robotic manipulation tasks are hindered by occlusions and limited fields of view, posing significant challenges for passive observation-based models that rely on fixed or wrist-mounted cameras. In this paper, we investigate the problem of robotic manipulation under limited visual observation and propose a task-driven asynchronous active vision-action model. Our model serially connects a camera Next-Best-View (NBV) policy with a gripper Next-Best-Pose (NBP) policy, and trains them in a sensor-motor coordination framework using few-shot reinforcement learning. This approach allows the agent to adjust a third-person camera to actively observe the environment based on the task goal, and subsequently infer the appropriate manipulation actions. We trained and evaluated our model on 8 viewpoint-constrained tasks in RLBench. The results demonstrate that our model consistently outperforms baseline algorithms, showcasing its effectiveness in handling visual constraints in manipulation tasks.
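The serial "observe, then act" ordering described in the abstract can be illustrated with a toy control step. This is only a minimal sketch under loud assumptions: the abstract gives no implementation details, so the viewpoint and pose representations, the visibility rule, and the names `observe`, `nbv_policy`, `nbp_policy`, and `observe_then_act_step` are all hypothetical placeholders. The hand-written rules stand in for the learned policies and are not the authors' method.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical representations (not from the paper):
Viewpoint = Tuple[float, float]      # camera (azimuth, elevation) in degrees
Pose = Tuple[float, float, float]    # gripper position (x, y, z)

HOME_POSE: Pose = (0.0, 0.0, 0.5)    # assumed safe pose when the target is unseen
TARGET_POSE: Pose = (0.5, 0.0, 0.2)  # assumed pose of the object to manipulate


@dataclass
class Observation:
    viewpoint: Viewpoint
    target_visible: bool


def observe(viewpoint: Viewpoint, min_elevation: float = 30.0) -> Observation:
    """Toy sensor: the target is occluded unless the camera is high enough."""
    return Observation(viewpoint, viewpoint[1] >= min_elevation)


def nbv_policy(obs: Observation) -> Viewpoint:
    """Placeholder for a learned NBV policy: raise the camera if occluded."""
    azimuth, elevation = obs.viewpoint
    if obs.target_visible:
        return obs.viewpoint
    return (azimuth, min(elevation + 20.0, 90.0))


def nbp_policy(obs: Observation) -> Pose:
    """Placeholder for a learned NBP policy: act only on a visible target."""
    return TARGET_POSE if obs.target_visible else HOME_POSE


def observe_then_act_step(viewpoint: Viewpoint) -> Tuple[Viewpoint, Pose]:
    """One serial step: choose a view first, then choose a gripper pose."""
    first_obs = observe(viewpoint)
    new_viewpoint = nbv_policy(first_obs)   # camera decision comes first
    second_obs = observe(new_viewpoint)     # re-observe from the new view
    gripper_pose = nbp_policy(second_obs)   # gripper decision uses that view
    return new_viewpoint, gripper_pose


if __name__ == "__main__":
    print(observe_then_act_step((0.0, 15.0)))
```

The point of the sketch is the ordering only: the gripper policy consumes the observation taken after the camera has moved, rather than the initial one.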

manipulation, robotic manipulation, viewpoint


Country:
- Oceania > New Zealand
  - North Island > Auckland Region > Auckland (0.04)
- North America
  - United States
    - Maryland > Baltimore (0.04)
    - Louisiana > Orleans Parish
      - New Orleans (0.04)
    - Georgia > Fulton County
      - Atlanta (0.04)
    - California > Los Angeles County
      - Long Beach (0.04)
  - Canada > Quebec
    - Montreal (0.04)
- Europe
  - Switzerland > Zürich
    - Zürich (0.14)
  - Greece > Attica
    - Athens (0.04)
  - France > Île-de-France
    - Paris > Paris (0.04)
- Asia
  - Singapore (0.04)
  - Macao (0.04)
  - South Korea > Daegu
    - Daegu (0.04)
  - Japan > Honshū
    - Kantō > Kanagawa Prefecture > Yokohama (0.04)
  - China
    - Beijing > Beijing (0.04)
    - Henan Province > Zhengzhou (0.04)

Genre:
- Research Report > New Finding (0.34)

Industry:
- Health & Medicine (0.47)

Technology:
- Information Technology > Artificial Intelligence
  - Robots (1.00)
  - Machine Learning > Reinforcement Learning (0.35)