PlanAgent: A Multi-modal Large Language Agent for Closed-loop Vehicle Motion Planning

Yupeng Zheng, Zebin Xing, Qichao Zhang, Bu Jin, Pengfei Li, Yuhang Zheng, Zhongpu Xia, Kun Zhan, Xianpeng Lang, Yaran Chen, Dongbin Zhao

Jun-4-2024–arXiv.org Artificial Intelligence

Vehicle motion planning is an essential component of autonomous driving. Current rule-based planners perform satisfactorily in common scenarios but struggle to generalize to long-tailed situations, while learning-based methods have yet to outperform rule-based approaches in large-scale closed-loop scenarios. To address these issues, we propose PlanAgent, the first mid-to-mid planning system based on a Multi-modal Large Language Model (MLLM). The MLLM serves as a cognitive agent that brings human-like knowledge, interpretability, and common-sense reasoning into closed-loop planning. PlanAgent leverages the MLLM through three core modules. First, an Environment Transformation module constructs a Bird's Eye View (BEV) map and a lane-graph-based textual description of the environment as inputs. Second, a Reasoning Engine module applies a hierarchical chain-of-thought that proceeds from scene understanding to lateral and longitudinal motion instructions and culminates in planner code generation. Last, a Reflection module simulates and evaluates the generated planner to reduce the MLLM's uncertainty. Drawing on the common-sense reasoning and generalization capability of the MLLM, PlanAgent can handle both common and complex long-tailed scenarios. We evaluate PlanAgent on the large-scale and challenging nuPlan benchmarks, where experiments show that it outperforms the existing state of the art in closed-loop motion planning. Code will be released soon.
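The three-module loop described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: every name here (`SceneInput`, `PlannerCandidate`, `transform_environment`, `plan_with_reflection`), the score threshold, the retry limit, and the feedback string are hypothetical choices made for illustration, and the MLLM call and the simulator are passed in as plain callables rather than real APIs.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class SceneInput:
    """Hypothetical container for the two MLLM inputs named in the abstract."""
    bev_image: bytes        # placeholder for a rendered Bird's Eye View map
    lane_graph_text: str    # lane-graph-based textual description


@dataclass
class PlannerCandidate:
    """Hypothetical output of the Reasoning Engine."""
    reasoning: List[str]    # chain-of-thought steps (scene, lateral, longitudinal)
    planner_code: str       # generated planner code or parameters


def transform_environment(raw_scene: dict) -> SceneInput:
    """Environment Transformation (assumed form): describe each lane as text."""
    lanes = raw_scene.get("lanes", [])
    text = "; ".join(
        f"lane {lane['id']} -> successors {lane['successors']}" for lane in lanes
    )
    # Rendering a real BEV image is out of scope, so an empty placeholder is used.
    return SceneInput(bev_image=b"", lane_graph_text=text)


def plan_with_reflection(
    raw_scene: dict,
    reason: Callable[[SceneInput, Optional[str]], PlannerCandidate],
    simulate: Callable[[PlannerCandidate], float],
    min_score: float = 0.8,   # assumed acceptance threshold
    max_rounds: int = 3,      # assumed retry limit
) -> Tuple[PlannerCandidate, float]:
    """Generate a planner, score it in simulation, and retry with feedback."""
    scene = transform_environment(raw_scene)
    feedback: Optional[str] = None
    best: Optional[Tuple[PlannerCandidate, float]] = None
    for _ in range(max_rounds):
        candidate = reason(scene, feedback)   # Reasoning Engine (MLLM stand-in)
        score = simulate(candidate)           # Reflection: simulate and evaluate
        if best is None or score > best[1]:
            best = (candidate, score)
        if score >= min_score:
            break
        feedback = f"Previous planner scored {score:.2f}; please revise it."
    assert best is not None, "max_rounds must be at least 1"
    return best
```

Passing the reasoning and simulation steps in as callables keeps the sketch independent of any particular MLLM or simulator; the highest-scoring candidate is returned even when no round reaches the threshold.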


