V2X-UniPool: Unifying Multimodal Perception and Knowledge Reasoning for Autonomous Driving

Luo, Xuewen, Yang, Fengze, Ding, Fan, Gao, Xiangbo, Xing, Shuo, Zhou, Yang, Tu, Zhengzhong, Liu, Chenxi

Oct-6-2025–arXiv.org Artificial Intelligence

Autonomous driving (AD) has achieved significant progress, yet single-vehicle perception remains constrained by sensing range and occlusions. Vehicle-to-Everything (V2X) communication addresses these limits by enabling collaboration across vehicles and infrastructure, but it also faces heterogeneity, synchronization, and latency constraints. Language models offer strong knowledge-driven reasoning and decision-making capabilities, but they are not inherently designed to process raw sensor streams and are prone to hallucination. We propose V2X-UniPool, the first framework that unifies V2X perception with language-based reasoning for knowledge-driven AD. It transforms multimodal V2X data into structured, language-based knowledge, organizes it in a time-indexed knowledge pool for temporally consistent reasoning, and employs Retrieval-Augmented Generation (RAG) to ground decisions in real-time context. Experiments on the real-world DAIR-V2X dataset show that V2X-UniPool achieves state-of-the-art planning accuracy and safety while reducing communication cost by more than 80%, achieving the lowest overhead among evaluated methods. These results highlight the promise of bridging V2X perception and language reasoning to advance scalable and trustworthy driving. Our code is available at: https://github.com/Xuewen2025/V2X-UniPool
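To make the idea of a time-indexed knowledge pool concrete, the following is a minimal sketch and not the authors' implementation. It assumes that perception outputs have already been converted to short text descriptions, and it substitutes a simple time-window lookup for the retrieval step, whereas a real RAG pipeline would typically also rank entries by relevance. The names `KnowledgePool`, `add`, `retrieve`, and `build_prompt` are hypothetical and do not come from the paper or its repository.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass, field


@dataclass
class KnowledgePool:
    """Hypothetical time-indexed store of language-based scene descriptions."""

    _times: list = field(default_factory=list)    # timestamps in seconds, kept sorted
    _entries: list = field(default_factory=list)  # text entries aligned with _times

    def add(self, timestamp: float, text: str) -> None:
        # Insert the entry so that both lists stay sorted by timestamp.
        i = bisect_right(self._times, timestamp)
        self._times.insert(i, timestamp)
        self._entries.insert(i, text)

    def retrieve(self, now: float, window: float) -> list:
        # Return entries with now - window <= timestamp <= now, oldest first.
        lo = bisect_left(self._times, now - window)
        hi = bisect_right(self._times, now)
        return self._entries[lo:hi]


def build_prompt(pool: KnowledgePool, now: float, window: float, question: str) -> str:
    # Assemble retrieved context and the question into one prompt string
    # (assumed format; the language model call itself is out of scope here).
    context = "\n".join(f"- {entry}" for entry in pool.retrieve(now, window))
    return f"Context:\n{context}\nQuestion: {question}"


pool = KnowledgePool()
pool.add(2.0, "Roadside camera: pedestrian entering crosswalk ahead.")
pool.add(1.0, "Roadside LiDAR: vehicle stopped in right lane.")
pool.add(5.0, "Signal: light turned green.")
prompt = build_prompt(pool, now=2.5, window=2.0, question="Is it safe to proceed?")
```

Only entries inside the time window reach the prompt, which illustrates how grounding on recent, timestamped text could keep the model's context small compared with forwarding raw sensor data.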

large language model, machine learning, natural language


Country:
- North America > United States > Texas (0.14)

Genre:
- Research Report (1.00)

Industry:
- Information Technology (1.00)
- Transportation > Ground
  - Road (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Vision (1.00)
  - Robots > Autonomous Vehicles (1.00)
  - Representation & Reasoning (1.00)
  - Cognitive Science > Problem Solving (1.00)
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (0.89)
  - Machine Learning > Neural Networks
    - Deep Learning (0.89)
