DriveCritic: Towards Context-Aware, Human-Aligned Evaluation for Autonomous Driving with Vision-Language Models
Song, Jingyu, Li, Zhenxin, Lan, Shiyi, Sun, Xinglong, Chang, Nadine, Shen, Maying, Chen, Joshua, Skinner, Katherine A., Alvarez, Jose M.
arXiv.org Artificial Intelligence
Abstract: Benchmarking autonomous driving planners in a way that aligns with human judgment remains a critical challenge, as state-of-the-art metrics like the Extended Predictive Driver Model Score (EPDMS) lack context awareness in nuanced scenarios. To address this, we introduce DriveCritic, a novel framework featuring two key contributions: the DriveCritic dataset, a curated collection of challenging scenarios in which context is critical for correct judgment, annotated with pairwise human preferences; and the DriveCritic model, a Vision-Language Model (VLM) based evaluator. Fine-tuned using a two-stage supervised and reinforcement learning pipeline, the DriveCritic model learns to adjudicate between trajectory pairs by integrating visual and symbolic context. Experiments show DriveCritic significantly outperforms existing metrics and baselines in matching human preferences and demonstrates strong context awareness. Overall, our work provides a more reliable, human-aligned foundation for evaluating autonomous driving systems.

Planning is one of the central components of autonomous driving, as it is expected to predict safe and efficient future trajectories for the autonomous vehicle to follow [1], [2]. Recently, end-to-end (E2E) driving systems trained with planning-oriented goals have advanced at a fast pace and demonstrated superior performance in large-scale benchmarks [3]-[7]. However, benchmarking planners in a way that accurately reflects safety and human expectations remains challenging [8], [9].
Oct-16-2025
- Genre:
  - Research Report (0.64)
- Industry:
  - Information Technology > Robotics & Automation (1.00)
  - Automobiles & Trucks (1.00)
  - Transportation > Ground
    - Road (1.00)
- Technology:
  - Information Technology > Artificial Intelligence
    - Robots > Autonomous Vehicles (1.00)
    - Representation & Reasoning (1.00)
    - Natural Language (1.00)
    - Cognitive Science > Problem Solving (1.00)
    - Machine Learning > Neural Networks
      - Deep Learning (0.47)
  - Information Technology > Artificial Intelligence