DexVLA: Vision-Language Model with Plug-In Diffusion Expert for General Robot Control

Wen, Junjie, Zhu, Yichen, Li, Jinming, Tang, Zhibin, Shen, Chaomin, Feng, Feifei

arXiv.org Artificial Intelligence

Enabling robots to perform diverse tasks across varied environments is a central challenge in robot learning. While vision-language-action (VLA) models have shown promise for generalizable robot skills, realizing their full potential requires addressing limitations in action representation and efficient training. Current VLA models often focus on scaling the vision-language model (VLM) component, while the action-space representation remains a critical bottleneck. This paper introduces DexVLA, a novel framework designed to enhance the efficiency and generalization of VLAs on complex, long-horizon tasks across diverse robot embodiments. DexVLA features a diffusion-based action expert, scaled to one billion parameters, designed for cross-embodiment learning. An embodiment curriculum learning strategy makes training efficient: (1) pre-training the diffusion expert, which is separable from the VLA, on cross-embodiment data; (2) aligning the VLA model to specific embodiments; and (3) post-training for rapid adaptation to new tasks. We conduct comprehensive experiments across multiple embodiments, including single-arm, bimanual, and dexterous-hand robots, demonstrating DexVLA's ability to handle challenging tasks without task-specific adaptation, to learn dexterous skills on novel embodiments with limited data, and to complete complex, long-horizon tasks such as laundry folding using only direct language prompting. In all settings, our method outperforms state-of-the-art models such as Octo, OpenVLA, and Diffusion Policy.
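The three-stage curriculum in the abstract can be sketched as a training schedule over parameter groups. This is a hypothetical illustration based only on the abstract: the stage names, module names (`diffusion_expert`, `vlm_backbone`), and the assumption that the backbone is untouched in stage 1 are all ours, not the authors' actual API.

```python
def train_stage(log, stage, trainable, data):
    """Record one curriculum stage: which modules are updated on which data.
    (Stand-in for a real training loop.)"""
    log.append({"stage": stage, "trainable": trainable, "data": data})

log = []

# Stage 1: pre-train the separable 1B-parameter diffusion action expert on
# pooled cross-embodiment data (assumption: the VLM backbone is not updated).
train_stage(log, "pretrain", ["diffusion_expert"],
            ["single_arm", "bimanual", "dexterous_hand"])

# Stage 2: align the full VLA (VLM backbone + diffusion expert) to one
# specific target embodiment.
train_stage(log, "align", ["vlm_backbone", "diffusion_expert"], ["bimanual"])

# Stage 3: brief post-training for rapid adaptation to a new task.
train_stage(log, "posttrain", ["vlm_backbone", "diffusion_expert"],
            ["laundry_folding_demos"])
```

The point of the split is that the expensive cross-embodiment pre-training (stage 1) is done once and reused, while stages 2 and 3 are comparatively cheap per-embodiment and per-task steps.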


Mirage: Cross-Embodiment Zero-Shot Policy Transfer with Cross-Painting

Chen, Lawrence Yunliang, Hari, Kush, Dharmarajan, Karthik, Xu, Chenfeng, Vuong, Quan, Goldberg, Ken

arXiv.org Artificial Intelligence

The ability to reuse collected data and transfer trained policies between robots could alleviate the burden of additional data collection and training. While existing approaches such as pretraining plus finetuning and co-training show promise, they do not generalize to robots unseen during training. Focusing on common robot arms with similar workspaces and two-jaw grippers, we investigate the feasibility of zero-shot transfer. Through simulation studies on 8 manipulation tasks, we find that state-based Cartesian control policies can successfully zero-shot transfer to a target robot after accounting for forward dynamics. To address robot visual disparities for vision-based policies, we introduce Mirage, which uses "cross-painting" (masking out the unseen target robot and inpainting the seen source robot) during execution in real time, so that it appears to the policy as if the trained source robot were performing the task. Mirage applies to both first-person and third-person camera views, and to policies that take both states and images, or images alone, as input. Despite its simplicity, our extensive simulation and physical experiments provide strong evidence that Mirage can zero-shot transfer between different robot arms and grippers with only minimal performance degradation on a variety of manipulation tasks such as picking, stacking, and assembly, significantly outperforming a generalist policy. Project website: https://robot-mirage.github.io/
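The cross-painting idea can be illustrated with a minimal NumPy sketch: replace the target robot's pixels with a background estimate, then composite a rendering of the source robot on top. The mean-fill inpainting, the pre-supplied masks, and the `cross_paint` function are simplifying assumptions for illustration; the paper's actual pipeline uses a proper renderer and inpainting method.

```python
import numpy as np

def cross_paint(observation, target_mask, source_render, source_mask):
    """Make the image look as if the source robot were acting:
    mask out the target robot, then paint in the rendered source robot."""
    out = observation.copy()
    # Crude "inpainting": fill target-robot pixels with the mean color of
    # the unmasked background (stand-in for a real inpainting model).
    background = out[~target_mask].mean(axis=0)
    out[target_mask] = background
    # Composite the rendered source robot over the image.
    out[source_mask] = source_render[source_mask]
    return out

# Toy 4x4 RGB observation: all-white image, target robot in the center,
# black source-robot render occupying the top-left pixel.
obs = np.ones((4, 4, 3), dtype=np.float32)
target_mask = np.zeros((4, 4), dtype=bool)
target_mask[1:3, 1:3] = True
source_mask = np.zeros((4, 4), dtype=bool)
source_mask[0, 0] = True
render = np.zeros((4, 4, 3), dtype=np.float32)

painted = cross_paint(obs, target_mask, render, source_mask)
```

After the call, the policy sees the source-robot pixels where the render was placed and a plausible background where the target robot used to be, while the rest of the scene is untouched.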


Franka: A Robot Arm That's Safe, Low Cost, and Can Replicate Itself

IEEE Spectrum Robotics

Sami Haddadin once attached a knife to a robot manipulator and programmed it to impale his arm. He was demonstrating how a new force-sensing control scheme he designed was able to detect the contact and instantly stop the robot, which it did. Now Haddadin wants to make that same kind of safety feature, which has long been limited to highly sophisticated and expensive systems, affordable to anyone using robots around people. Sometime in 2017, his Munich-based startup, Franka Emika, will start shipping a rather remarkable robotic arm. It's designed to be easy to set up and program, which is nice.