An Initial Study of Bird's-Eye View Generation for Autonomous Vehicles using Cross-View Transformers

Santos, Felipe Carlos dos, Antonelo, Eric Aislan, Couto, Gustavo Claudio Karl

Aug-19-2025–arXiv.org Artificial Intelligence

Bird's-Eye View (BEV) maps provide a structured, top-down abstraction that is crucial for autonomous-driving perception. In this work, we employ Cross-View Transformers (CVT) to learn a mapping from camera images to three BEV channels (road, lane markings, and planned trajectory) using a realistic simulator for urban driving. Our study examines generalization to unseen towns, the effect of different camera layouts, and two loss formulations (focal and L1). Using training data from only a single town, a four-camera CVT trained with the L1 loss delivers the most robust test performance when evaluated in a new town. Overall, our results underscore CVT's promise for mapping camera inputs to reasonably accurate BEV maps.
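To make the two loss formulations concrete, the following is a minimal sketch of an L1 loss and a binary focal loss applied to one flattened BEV channel. It is not the authors' implementation: the function names, the sample values, and the defaults `gamma=2.0` and `alpha=0.25` (common choices in the focal-loss literature) are assumptions, since the abstract does not state the hyperparameters used.

```python
import math

def l1_loss(pred, target):
    """Mean absolute error between predicted and target pixel values."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def binary_focal_loss(pred, target, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss averaged over pixels.

    pred holds probabilities in [0, 1]; target holds 0/1 labels.
    gamma and alpha are assumed defaults, not values from the paper.
    """
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)           # clamp to avoid log(0)
        p_t = p if t == 1 else 1.0 - p            # probability of the true class
        alpha_t = alpha if t == 1 else 1.0 - alpha
        # (1 - p_t)^gamma down-weights pixels that are already well classified
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(pred)

# Hypothetical flattened BEV channel (e.g. lane markings): four pixels.
pred = [0.9, 0.2, 0.6, 0.1]
target = [1, 0, 1, 0]

print("L1 loss:   ", l1_loss(pred, target))
print("Focal loss:", binary_focal_loss(pred, target))
```

With `gamma=0` and `alpha=0.5` the focal loss reduces to half the usual binary cross-entropy, which is a convenient sanity check. In a three-channel setting, one plausible design is to compute the loss per channel and average, though the abstract does not say how the channels are combined.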


Country:
- South America > Brazil (0.46)

Genre:
- Research Report > New Finding (0.86)

Industry:
- Information Technology (0.37)
- Transportation > Ground
  - Road (0.49)

Technology:
- Information Technology
  - Sensing and Signal Processing > Image Processing (1.00)
  - Artificial Intelligence
    - Natural Language (0.94)
    - Robots > Autonomous Vehicles (0.71)
    - Machine Learning > Neural Networks
      - Deep Learning (0.47)
