Visual Reasoning at Urban Intersections: Fine-Tuning GPT-4o for Traffic Conflict Detection
Masri, Sari, Ashqar, Huthaifa I., Elhenawy, Mohammed
arXiv.org Artificial Intelligence
Traffic control at unsignalized urban intersections presents significant challenges due to their complexity, frequent conflicts, and blind spots. This study explores the capability of Multimodal Large Language Models (MLLMs), such as GPT-4o, to provide logical and visual reasoning directly from bird's-eye-view videos of four-legged intersections. In the proposed method, GPT-4o acts as an intelligent system that detects conflicts and provides explanations and recommendations for drivers. The fine-tuned model achieved an accuracy of 77.14%, while a manual evaluation of the true predicted values of the fine-tuned GPT-4o showed 89.9% accuracy for model-generated explanations and 92.3% for the recommended next actions. Urban intersections are highly challenging due to their unpredictability and dynamism, especially when they are unsignalized. Interactions often occur among motor vehicles and other road users in such areas.
Feb-27-2025
- Country:
  - Oceania > Australia > Queensland (0.14)
- Genre:
  - Research Report (1.00)
- Industry:
  - Automobiles & Trucks (0.68)
  - Transportation > Ground > Road (0.70)
- Technology: