Video Depth Estimation ModelCover FigureMerge360!imageto video

Jun-16-2026, 22:41:30 GMT–Neural Information Processing Systems

To mitigate the distortions brought by equirectangular projection, existing methods typically divide 360 images into distortion-less perspective patches. However, since these patches are processed independently, depth inconsistencies are often introduced due to scale drift among patches. Recently, video depth estimation (VDE) models have leveraged temporal consistency for stable depth predictions across frames. Inspired by this, we propose to represent a 360 image as a sequence of perspective frames, mimicking the viewpoint adjustments users make when exploring a 360 scenario in virtual reality. Thus, the spatial consistency among perspective depth patches can be enhanced by exploiting the temporal consistency inherent in VDE models. To this end, we introduce a training-free pipeline for 360 monocular depth estimation, called ST2360D.

artificial intelligence, depth estimation, image understanding, (16 more...)

Neural Information Processing Systems

Jun-16-2026, 22:41:30 GMT

Conferences PDF

Add feedback

Country:
- Asia > China (0.46)

Genre:
- Research Report
  - New Finding (1.00)
  - Experimental Study (1.00)

Technology:
- Information Technology
  - Human Computer Interaction > Interfaces
    - Virtual Reality (0.66)
  - Artificial Intelligence > Vision
    - Image Understanding (0.84)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found