RSA: Resolving Scale Ambiguities in Monocular Depth Estimators through Language Descriptions

Mar-22-2026, 12:45:13 GMT – Neural Information Processing Systems

We propose a method for metric-scale monocular depth estimation. Inferring depth from a single image is an ill-posed problem because perspective projection discards scale during image formation. Any chosen scale is a bias, typically inherited from the training dataset; hence, existing works have instead opted to use relative (normalized, inverse) depth. Our goal is to recover metric-scaled depth maps through a linear transformation. The crux of our method lies in the observation that certain objects (e.g., cars, trees, street signs) are typically found in or associated with certain types of scenes (e.g., outdoor).
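To make the "linear transformation" concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the transformation is a single global scale and shift applied to relative inverse depth, followed by inversion. The function name `to_metric_depth`, the parameters `scale`, `shift`, and `eps`, and all numeric values are hypothetical and chosen only for illustration. How the scale and shift are obtained is not shown here.

```python
from typing import List


def to_metric_depth(
    rel_inv_depth: List[List[float]],
    scale: float,
    shift: float,
    eps: float = 1e-6,
) -> List[List[float]]:
    """Map relative inverse depth to metric depth with one global linear transform.

    Assumption (not confirmed by the abstract): the transform acts in
    inverse-depth space, i.e. metric_depth = 1 / (scale * z + shift).
    """
    metric = []
    for row in rel_inv_depth:
        out_row = []
        for z in row:
            inv = scale * z + shift
            # Clamp to avoid division by zero or negative inverse depth.
            out_row.append(1.0 / max(inv, eps))
        metric.append(out_row)
    return metric


# The same relative prediction (larger value = closer) under two hypothetical
# scale/shift pairs, e.g. one suited to a small room and one to a street scene.
rel = [[1.0, 0.5], [0.25, 0.0]]
indoor = to_metric_depth(rel, scale=1.0, shift=0.25)
outdoor = to_metric_depth(rel, scale=0.1, shift=0.0125)

for name, depth in (("indoor", indoor), ("outdoor", outdoor)):
    print(name, [[round(d, 2) for d in row] for row in depth])
```

The depth ordering of the pixels is the same in both outputs; only the metric range changes. This is the scale ambiguity the abstract describes: the relative prediction alone cannot tell the two scenes apart.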

Technology:
- Information Technology > Artificial Intelligence (0.55)