OneTwoVLA: A Unified Vision-Language-Action Model with Adaptive Reasoning

Lin, Fanqi, Nai, Ruiqian, Hu, Yingdong, You, Jiacheng, Zhao, Junming, Gao, Yang

May-20-2025–arXiv.org Artificial Intelligence

General-purpose robots capable of performing diverse tasks require synergistic reasoning and acting capabilities. However, recent dual-system approaches, which separate high-level reasoning from low-level acting, often suffer from limited mutual understanding of capabilities between the two systems and from latency. This paper introduces OneTwoVLA, a single unified vision-language-action model that performs both acting (System One) and reasoning (System Two). Crucially, OneTwoVLA adaptively switches between two modes: it reasons explicitly at critical moments during task execution and, at all other times, generates actions based on the most recent reasoning. To further unlock OneTwoVLA's reasoning and generalization capabilities, we design a scalable pipeline for synthesizing embodied reasoning-centric vision-language data, which is used for co-training with robot data. We validate OneTwoVLA's effectiveness through extensive experiments, highlighting its superior performance across four key capabilities: long-horizon task planning, error detection and recovery, natural human-robot interaction, and generalizable visual grounding. These capabilities enable the model to perform long-horizon, highly dexterous manipulation tasks such as making hotpot or mixing cocktails.
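The adaptive switching described in the abstract can be illustrated with a toy control loop. This is a minimal sketch under stated assumptions, not the paper's implementation: the class `UnifiedPolicy`, the mode labels `REASON` and `ACT`, and the three callables are hypothetical names introduced here, and plain strings stand in for images, reasoning text, and robot actions. It only shows the control flow in which one policy either refreshes its reasoning or acts on the most recent reasoning.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

REASON = "reason"  # hypothetical label: refresh the textual reasoning
ACT = "act"        # hypothetical label: emit an action


@dataclass
class UnifiedPolicy:
    """Toy stand-in for a single model that both reasons and acts.

    All three callables are assumptions for illustration; in a real
    system they would be different outputs of one shared network.
    """
    choose_mode: Callable[[str, str], str]
    make_reasoning: Callable[[str, str], str]
    make_action: Callable[[str, str], str]
    latest_reasoning: str = ""
    trace: List[Tuple[str, str]] = field(default_factory=list)

    def step(self, observation: str) -> Optional[str]:
        """Run one step; return an action, or None if the step was spent reasoning."""
        mode = self.choose_mode(observation, self.latest_reasoning)
        if mode == REASON:
            # Critical moment: update the stored reasoning, emit no action.
            self.latest_reasoning = self.make_reasoning(
                observation, self.latest_reasoning
            )
            self.trace.append((REASON, self.latest_reasoning))
            return None
        # Otherwise act, conditioned on the most recent reasoning.
        action = self.make_action(observation, self.latest_reasoning)
        self.trace.append((ACT, action))
        return action


def toy_choose_mode(observation: str, reasoning: str) -> str:
    # Toy rule: reason when there is no plan yet or an error is observed.
    return REASON if (not reasoning or "error" in observation) else ACT


def toy_make_reasoning(observation: str, previous: str) -> str:
    return f"plan for: {observation}"


def toy_make_action(observation: str, reasoning: str) -> str:
    return f"move toward '{observation}' following '{reasoning}'"


policy = UnifiedPolicy(toy_choose_mode, toy_make_reasoning, toy_make_action)
observations = ["start", "cup visible", "error: cup dropped", "cup visible"]
outputs = [policy.step(obs) for obs in observations]
modes = [mode for mode, _ in policy.trace]
```

In this toy run the policy reasons on the first observation and again when the error appears, and acts on the stored reasoning in between. The hand-written rule in `toy_choose_mode` is only a placeholder for whatever learned decision the actual model makes.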

Country:
- North America (0.28)

Genre:
- Research Report (1.00)

Industry:
- Health & Medicine (0.93)
- Consumer Products & Services > Food, Beverage, Tobacco & Cannabis (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Robots (1.00)
  - Natural Language > Large Language Model (1.00)
  - Cognitive Science > Problem Solving (0.88)
  - Machine Learning > Neural Networks
    - Deep Learning (0.94)
