6Img-to-3D: Few-Image Large-Scale Outdoor Driving Scene Reconstruction

Gieruc, Théo, Kästingschäfer, Marius, Bernhard, Sebastian, Salzmann, Mathieu

Apr-18-2024–arXiv.org Artificial Intelligence

Current 3D reconstruction techniques struggle to faithfully infer unbounded scenes from a few images. Specifically, existing methods have high computational demands, require detailed pose information, and cannot reliably reconstruct occluded regions. We introduce 6Img-to-3D, an efficient, scalable transformer-based encoder-renderer method for single-shot image-to-3D reconstruction. Our method outputs a 3D-consistent parameterized triplane from only six outward-facing input images for large-scale, unbounded outdoor driving scenarios. We take a step towards resolving existing shortcomings by combining contracted custom cross- and self-attention mechanisms for triplane parameterization, differentiable volume rendering, scene contraction, and image feature projection. We show that six surround-view vehicle images from a single timestamp, without global pose information, are enough to reconstruct 360$^{\circ}$ scenes at inference time, which takes 395 ms. Our method allows, for example, rendering third-person images and bird's-eye views. Our code is available at https://github.com/continental/6Img-to-3D, and more examples can be found on our website at https://6Img-to-3D.GitHub.io/.
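To make three of the ingredients named in the abstract more concrete, the following is a minimal sketch of scene contraction, triplane projection, and volume-rendering compositing along a single ray. It is not the authors' implementation: the abstract does not specify the contraction function, so the one below is assumed to be the commonly used mip-NeRF 360 form, and the function names `contract`, `triplane_coords`, and `composite` are hypothetical.

```python
import math


def contract(point):
    """Map an unbounded 3D point into a ball of radius 2.

    Assumption: mip-NeRF 360 style contraction; the paper's exact
    function is not given in the abstract.
    """
    norm = math.sqrt(sum(c * c for c in point))
    if norm <= 1.0:
        return tuple(point)  # points inside the unit ball are left unchanged
    scale = (2.0 - 1.0 / norm) / norm
    return tuple(c * scale for c in point)


def triplane_coords(point):
    """Project a 3D point onto the three axis-aligned planes (xy, xz, yz)."""
    x, y, z = point
    return (x, y), (x, z), (y, z)


def composite(densities, colors, deltas):
    """Alpha-composite scalar samples along one ray, front to back.

    Returns the accumulated value and the remaining transmittance.
    """
    transmittance = 1.0
    pixel = 0.0
    for sigma, color, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        pixel += transmittance * alpha * color
        transmittance *= 1.0 - alpha
    return pixel, transmittance
```

In a triplane pipeline, a sample point would typically be contracted first, then projected with `triplane_coords` to look up features on each plane; the decoded densities and colors along a ray are then combined with `composite`. Every step is differentiable, which is what allows such a renderer to be trained from images.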

artificial intelligence, machine learning, natural language



Country:
- Asia
  - Japan > Honshū
    - Chūbu
      - Ishikawa Prefecture > Kanazawa (0.05)
      - Nagano Prefecture > Nagano (0.04)
  - Middle East > Palestine
    - West Bank > Bethlehem Governorate (0.04)
- Europe > Germany (0.04)
- North America
  - Canada > Quebec
    - Montreal (0.04)
  - United States
    - Louisiana > Orleans Parish
      - New Orleans (0.04)
    - Tennessee > Davidson County
      - Nashville (0.04)

Genre:
- Research Report (0.43)

Technology:
- Information Technology
  - Artificial Intelligence
    - Machine Learning > Neural Networks
      - Deep Learning (0.48)
    - Natural Language (0.88)
    - Representation & Reasoning (0.93)
    - Robots (1.00)
    - Vision (1.00)
  - Sensing and Signal Processing > Image Processing (1.00)
