FBI: Learning Dexterous In-hand Manipulation with Dynamic Visuotactile Shortcut Policy

Chen, Yijin, Xu, Wenqiang, Yu, Zhenjun, Tang, Tutian, Li, Yutong, Yao, Siqiong, Lu, Cewu

Aug-21-2025–arXiv.org Artificial Intelligence

Figure 1: We propose Flow Before Imitation (FBI), a novel dynamic visuotactile imitation learning algorithm for dexterous in-hand manipulation. FBI's design supports two operational modes in the real world, with or without physical tactile sensors, which broadens its application scenarios.

Abstract: Dexterous in-hand manipulation is a long-standing challenge in robotics due to complex contact dynamics and partial observability. This paper introduces Flow Before Imitation (FBI), a visuotactile imitation learning framework that dynamically fuses tactile interactions with visual observations through motion dynamics. Unlike prior static fusion methods, FBI establishes a causal link between tactile signals and object motion via a dynamics-aware latent model. FBI employs a transformer-based interaction module to fuse flow-derived tactile features with visual inputs, and trains a one-step diffusion policy for real-time execution. Extensive experiments demonstrate that FBI outperforms the baseline methods in both simulation and the real world on two customized in-hand manipulation tasks and three standard dexterous manipulation tasks.
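The abstract does not specify how the interaction module is built, so the following is only a generic illustration of attention-based fusion, not the authors' architecture. It sketches single-head cross-attention in which visual tokens act as queries over tactile tokens, followed by a residual add. The function name `cross_attention_fuse`, the token shapes, and the omission of learned projection matrices are all assumptions made for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attention_fuse(visual_tokens, tactile_tokens):
    """Hypothetical fusion step: each visual token attends over tactile tokens.

    No learned query/key/value projections are used (an assumption for
    illustration); a real transformer layer would include them.
    """
    dim = len(visual_tokens[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for query in visual_tokens:
        # Scaled dot-product attention weights over the tactile tokens.
        weights = softmax([dot(query, key) * scale for key in tactile_tokens])
        # Weighted sum of tactile tokens (used here as both keys and values).
        attended = [
            sum(w * value[i] for w, value in zip(weights, tactile_tokens))
            for i in range(dim)
        ]
        # Residual connection: keep the visual feature, add tactile context.
        fused.append([q + a for q, a in zip(query, attended)])
    return fused


# Toy inputs: two visual tokens and three identical tactile tokens.
visual = [[1.0, 0.0], [0.0, 1.0]]
tactile = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
fused = cross_attention_fuse(visual, tactile)
```

The output has one fused vector per visual token, so a downstream policy head could consume it in place of the purely visual features.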

artificial intelligence, machine learning, manipulation

Country:
- Asia > China
  - Shanghai > Shanghai (0.04)
- Europe > Germany
  - Bavaria > Upper Bavaria > Munich (0.04)

Genre:
- Research Report (0.64)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks
    - Deep Learning (0.34)
  - Robots > Manipulation (0.90)