Low Fidelity Visuo-Tactile Pretraining Improves Vision-Only Manipulation Performance
Selam Gano, Abraham George, Amir Barati Farimani
arXiv.org Artificial Intelligence
Translating advances in visual perception to robotic grasping and manipulation of objects remains challenging. Complex manipulation tasks such as peg insertion, pulling or twisting against resistance, and dynamic motions such as throwing and catching call for fine-grained manipulation, which requires tactile perception. Tactile sensors have been paired with visual sensors in both classical control and machine learning approaches to these tasks [1], but fragility and cost are barriers to heavy use or industrial integration, particularly for manipulation tasks that place higher forces on sensors at the tactile edge. Previously, a GelSight [2] tactile sensor was used to train an agent on a USB insertion task [3], the first time this was achieved with imitation learning. GelSight is not designed for robustness to higher shear forces and was noted to break irrecoverably during data collection and inference for that task, requiring repeated replacement. That work also demonstrated an approach that uses tactile information only during pretraining and then ablates the tactile sensor at inference, yielding a more robust vision-only manipulation system. BeadSight [4] was designed as a simpler, low-cost, calibration-free sensor that, like GelSight, operates at an end effector's point of contact with objects. We constructed a BeadSight sensor, which does not rely on any calibration and instead relies entirely on neural networks to distill information about contacts and movements at the tactile edge. In this work, we repeated the visuo-tactile pretraining USB plugging experiment using the much lower fidelity BeadSight, producing a direct comparison with the GelSight sensor on the same task.
Jun-25-2024
- Genre:
- Research Report (1.00)