M2-Reasoning: Empowering MLLMs with Unified General and Spatial Reasoning

Inclusion AI: Wang, Fudong; Liu, Jiajia; Chen, Jingdong; Zhou, Jun; Ji, Kaixiang; Ru, Lixiang; Guo, Qingpei; Zheng, Ruobing; Li, Tianqi; Yuan, Yi; Mao, Yifan; Xiao, Yuting; Ma, Ziping

Jul-14-2025–arXiv.org Artificial Intelligence

Recent advancements in Multimodal Large Language Models (MLLMs), particularly through Reinforcement Learning with Verifiable Rewards (RLVR), have significantly enhanced their reasoning abilities. However, a critical gap persists: these models struggle with dynamic spatial interactions, a capability essential for real-world applications. To bridge this gap, we introduce M2-Reasoning-7B, a model designed to excel in both general and spatial reasoning. Our approach integrates two key innovations: (1) a novel data pipeline that generates 294.2K high-quality data samples (168K for cold-start fine-tuning and 126.2K for RLVR), which feature logically coherent reasoning trajectories and have undergone comprehensive assessment; and (2) a dynamic multi-task training strategy with step-wise optimization to mitigate conflicts between data, and task-specific rewards for delivering tailored incentive signals. This combination of curated data and advanced training allows M2-Reasoning-7B to set a new state-of-the-art (SOTA) across 8 benchmarks, showcasing superior performance in both general and spatial reasoning domains.
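As a rough illustration of what "task-specific rewards" can mean in an RLVR setting, the following minimal sketch routes each sample to a reward function chosen by its task type. It is not the paper's implementation: the task names (`general`, `spatial_numeric`), the function names, and the 10% relative tolerance are all hypothetical choices made for this example only.

```python
import math
from typing import Callable, Dict


def exact_match_reward(prediction: str, target: str) -> float:
    """Binary verifiable reward: 1.0 on a case-insensitive exact match, else 0.0."""
    return 1.0 if prediction.strip().lower() == target.strip().lower() else 0.0


def numeric_tolerance_reward(prediction: str, target: str, rel_tol: float = 0.1) -> float:
    """Graded reward for numeric answers (e.g. an estimated distance).

    The reward falls linearly from 1.0 (exact) to 0.0 once the relative
    error reaches rel_tol. The 10% default is an assumption for this sketch.
    """
    try:
        pred, gold = float(prediction), float(target)
    except ValueError:
        return 0.0  # an unparseable answer earns nothing
    if not (math.isfinite(pred) and math.isfinite(gold)):
        return 0.0
    if gold == 0.0:
        return 1.0 if pred == 0.0 else 0.0
    rel_err = abs(pred - gold) / abs(gold)
    return max(0.0, 1.0 - rel_err / rel_tol)


# Hypothetical task registry: each task type gets its own incentive signal.
REWARD_BY_TASK: Dict[str, Callable[[str, str], float]] = {
    "general": exact_match_reward,
    "spatial_numeric": numeric_tolerance_reward,
}


def compute_reward(task: str, prediction: str, target: str) -> float:
    """Dispatch to the reward function registered for the given task type."""
    return REWARD_BY_TASK[task](prediction, target)
```

The dispatch table keeps the reward logic separate from the training loop, so a new task type only needs a new entry in the registry. Whether a graded or a binary signal suits a given spatial task is a design decision that this sketch leaves open.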

Genre:
- Research Report (0.64)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Spatial Reasoning (1.00)
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (0.94)
  - Machine Learning > Neural Networks
    - Deep Learning (0.68)
