DepthMamba with Adaptive Fusion

Dec-27-2024–arXiv.org Artificial Intelligence

Multi-view depth estimation has achieved impressive performance on various benchmarks. However, almost all current multi-view systems rely on given ideal camera poses, which are unavailable in many real-world scenarios, such as autonomous driving. In this work, we propose a new robustness benchmark to evaluate depth estimation systems under various noisy pose settings. Surprisingly, we find that current multi-view depth estimation methods, as well as methods that fuse single-view and multi-view estimates, fail under noisy pose settings. To tackle this challenge, we propose a two-branch network architecture that fuses the depth estimation results of a single-view branch and a multi-view branch. Specifically, we introduce Mamba as the feature extraction backbone and propose an attention-based fusion method that adaptively selects the more robust estimation results between the two branches. As a result, the proposed method performs well on challenging scenes, such as those with dynamic objects and texture-less regions. Ablation studies demonstrate the effectiveness of the backbone and the fusion method, while evaluation experiments on challenging benchmarks (KITTI and DDAD) show that the proposed method achieves competitive performance compared to state-of-the-art methods.
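The adaptive selection between the two branches can be illustrated with a small NumPy sketch. This is not the paper's implementation: the abstract does not specify the fusion module, so the function name `fuse_depths`, the per-pixel confidence logits, and the two-way softmax weighting are all assumptions chosen only to show the general idea of weighting two depth estimates per pixel.

```python
import numpy as np


def fuse_depths(depth_single, depth_multi, logit_single, logit_multi):
    """Blend two depth maps per pixel using softmax weights (hypothetical sketch).

    All inputs are arrays of shape (H, W). The logits stand in for whatever
    confidence scores a learned fusion module would produce.
    """
    logits = np.stack([logit_single, logit_multi], axis=0)  # shape (2, H, W)
    logits = logits - logits.max(axis=0, keepdims=True)     # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=0, keepdims=True)           # softmax over the two branches
    fused = weights[0] * depth_single + weights[1] * depth_multi
    return fused, weights


# Toy inputs: the single-view branch predicts 10 m everywhere, the multi-view branch 20 m.
depth_single = np.full((2, 2), 10.0)
depth_multi = np.full((2, 2), 20.0)
# Assumed confidence logits: a higher value means the branch is trusted more at that pixel.
logit_single = np.array([[0.0, 5.0], [-5.0, 0.0]])
logit_multi = np.zeros((2, 2))

fused, weights = fuse_depths(depth_single, depth_multi, logit_single, logit_multi)
print(fused)
```

Where the two logits are equal, the fused depth is the average of the two branches; where one logit dominates, the output approaches that branch's estimate. This is the behaviour one would want when the multi-view branch is degraded by a noisy pose.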

artificial intelligence, image understanding, machine learning

Genre:
- Research Report > Promising Solution (0.48)

Industry:
- Information Technology (0.35)
- Automobiles & Trucks (0.35)
- Transportation > Ground
  - Road (0.35)

Technology:
- Information Technology > Artificial Intelligence
  - Vision > Image Understanding (1.00)
  - Representation & Reasoning > Information Fusion (0.76)
  - Machine Learning > Neural Networks
    - Deep Learning (0.47)
