DSPv2: Improved Dense Policy for Effective and Generalizable Whole-body Mobile Manipulation

Su, Yue, Zhang, Chubin, Chen, Sijin, Tan, Liufan, Tang, Yansong, Wang, Jianan, Liu, Xihui

Sep-29-2025–arXiv.org Artificial Intelligence

Figure 1: DSPv2 is a whole-body mobile manipulation policy that achieves generalizable performance by fusing multi-view 2D semantic perception with 3D spatial awareness, and generates coherent whole-body actions via a dense action head.

Abstract -- Learning whole-body mobile manipulation via imitation is essential for generalizing robotic skills to diverse environments and complex tasks. However, this goal is hindered by significant challenges, particularly in effectively processing complex observations, achieving robust generalization, and generating coherent actions. To address these issues, we propose DSPv2, a novel policy architecture. DSPv2 introduces an effective encoding scheme that aligns 3D spatial features with multi-view 2D semantic features. This fusion enables the policy to achieve broad generalization while retaining the fine-grained perception necessary for precise control. Furthermore, we extend the Dense Policy paradigm to the whole-body mobile manipulation domain, demonstrating its effectiveness in generating coherent and precise actions for the whole-body robotic platform. Extensive experiments show that our method significantly outperforms existing approaches in both task performance and generalization ability. The project page is available at: https://selen-suyue.github.io/DSPv2Net/

artificial intelligence, machine learning, natural language

Genre:
- Research Report (0.85)

Technology:
- Information Technology > Artificial Intelligence
  - Robots (1.00)
  - Natural Language (1.00)
  - Machine Learning (1.00)
  - Representation & Reasoning > Spatial Reasoning (0.35)
