A Unified Perception-Language-Action Framework for Adaptive Autonomous Driving
Yi Zhang, Erik Leo Haß, Kuo-Yi Chao, Nenad Petrovic, Yinglei Song, Chengdong Wu, Alois Knoll
arXiv.org Artificial Intelligence
Chair of Robotics, Artificial Intelligence and Embedded Systems, Technical University of Munich (TUM), Munich, Germany
{yi1228.zhang, erik-leo.hass,

Abstract -- Autonomous driving systems face significant challenges in achieving human-like adaptability, robustness, and interpretability in complex, open-world environments. These challenges stem from fragmented architectures, limited generalization to novel scenarios, and insufficient semantic extraction from perception. To address these limitations, we propose a unified Perception-Language-Action (PLA) framework that integrates multi-sensor fusion (cameras, LiDAR, radar) with a large language model (LLM)-augmented Vision-Language-Action (VLA) architecture, specifically a GPT-4.1-powered reasoning core. This framework unifies low-level sensory processing with high-level contextual reasoning, tightly coupling perception with natural language-based semantic understanding and decision-making to enable context-aware, explainable, and safety-bounded autonomous driving.
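The perception-language-action loop described in the abstract can be sketched minimally as follows. This is an illustrative skeleton only: the dataclass fields, the rule-based `reason` stub standing in for the GPT-4.1 reasoning core, and the `act` mapping are all hypothetical names invented here, not the paper's actual interfaces.

```python
from dataclasses import dataclass

@dataclass
class Perception:
    """Fused multi-sensor observation (fields are illustrative)."""
    obstacle_distance_m: float
    scene_description: str

def reason(perception: Perception) -> str:
    # Placeholder for the LLM reasoning core (e.g. GPT-4.1 in the paper):
    # it would consume the semantic scene description and return a
    # natural-language decision. A simple rule stands in for the model.
    if perception.obstacle_distance_m < 10.0:
        return "brake: obstacle ahead"
    return "proceed: path clear"

def act(decision: str, max_decel_mps2: float = 8.0) -> dict:
    # Map the language-level decision to a safety-bounded control command;
    # the brake output is clamped to [0, 1] regardless of the request.
    if decision.startswith("brake"):
        return {"throttle": 0.0, "brake": min(1.0, max_decel_mps2 / 8.0)}
    return {"throttle": 0.3, "brake": 0.0}

obs = Perception(obstacle_distance_m=6.5, scene_description="pedestrian crossing")
decision = reason(obs)
command = act(decision)
print(decision, command)
```

In a real system, `reason` would be an LLM call whose free-text answer is parsed before actuation, with the safety bounds in `act` enforced independently of the model's output.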
Aug-1-2025
- Country:
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.44)
- Genre:
- Research Report > New Finding (0.46)
- Industry:
- Automobiles & Trucks (0.86)
- Information Technology > Robotics & Automation (0.86)
- Transportation > Ground > Road (0.96)
- Technology: