Towards Deploying VLA without Fine-Tuning: Plug-and-Play Inference-Time VLA Policy Steering via Embodied Evolutionary Diffusion

Zhuo Li, Junjia Liu, Zhipeng Dong, Tao Teng, Quentin Rouxel, Darwin Caldwell, Fei Chen

arXiv.org Artificial Intelligence 

However, pre-trained VLA policies still suffer from substantial performance degradation during downstream deployment. Although fine-tuning can mitigate this issue, its reliance on costly demonstration collection and intensive computation makes it impractical in real-world settings. In this work, we introduce VLA-Pilot, a plug-and-play inference-time policy steering method for zero-shot deployment of pre-trained VLA without any additional fine-tuning or data collection. We evaluate VLA-Pilot on six real-world downstream manipulation tasks across two distinct robotic embodiments, encompassing both in-distribution and out-of-distribution scenarios. Experimental results demonstrate that VLA-Pilot substantially boosts the success rates of off-the-shelf pre-trained VLA policies, enabling robust zero-shot generalization to diverse tasks and embodiments. Experimental videos and code are available at: https://rip4kobe.github.io/vla-pilot/.

I. INTRODUCTION

Recent advances in VLA models have substantially improved the generalization capabilities of robotic manipulation. By learning from large-scale demonstrations [1], these generative foundation policies enable robots to acquire a wide repertoire of skills. At inference time, they can perform diverse and contextually appropriate tasks by stochastically sampling actions from the learned skill distribution.
