SpatialReasoner: Towards Explicit and Generalizable 3DSpatial Reasoning

Jun-22-2026, 19:34:51 GMT–Neural Information Processing Systems

Despite recent advances on multi-modal models, 3D spatial reasoning remains a challenging task for state-of-the-art open-source and proprietary models. Recent studies explore data-driven approaches and achieve enhanced spatial reasoning performance by fine-tuning models on 3D-related visual question-answering data. However, these methods typically perform spatial reasoning in an implicit manner and often fail on questions that are trivial to humans, even with long chain-ofthought reasoning. In this work, we introduce SpatialReasoner, a novel large visionlanguage model (LVLM) that addresses 3D spatial reasoning with explicit 3D representations shared between multiple stages-3D perception, computation, and reasoning. Explicit 3D representations provide a coherent interface that supports advanced 3D spatial reasoning and improves the generalization ability to novel question types. Furthermore, by analyzing the explicit 3D representations in multistep reasoning traces of SpatialReasoner, we study the factual errors and identify key shortcomings of current LVLMs. Results show that our SpatialReasoner achieves improved performance on a variety of spatial reasoning benchmarks, outperforming Gemini 2.0 by 9.2% on 3DSRBench, and generalizes better when evaluating on novel 3D spatial reasoning questions.

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Jun-22-2026, 19:34:51 GMT

Conferences PDF

Add feedback

Country:
- North America (0.28)

Genre:
- Research Report > Experimental Study (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Spatial Reasoning (1.00)
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found