SEER-VAR: Semantic Egocentric Environment Reasoner for Vehicle Augmented Reality

Lai, Yuzhi, Yuan, Shenghai, Li, Peizheng, Lou, Jun, Zell, Andreas

arXiv.org Artificial Intelligence 

Unlike existing systems that assume static or single-view settings, SEER-VAR dynamically separates cabin and road scenes via depth-guided vision-language grounding. Two SLAM branches track egocentric motion in each context, while a GPT-based module generates context-aware overlays such as dashboard cues and hazard alerts. To support evaluation, we introduce EgoSLAM-Drive, a real-world dataset featuring synchronized egocentric views, 6DoF ground-truth poses, and AR annotations across diverse driving scenarios. Experiments demonstrate that SEER-VAR achieves robust spatial alignment and perceptually coherent AR rendering across varied environments. As one of the first to explore LLM-based AR recommendation in egocentric driving, we address the lack of comparable systems through structured prompting and detailed user studies. Results show that SEER-VAR enhances perceived scene understanding, overlay relevance, and driver ease, providing an effective foundation for future research in this direction. Code and dataset will be made open source.