depth uncertainty
EvidMTL: Evidential Multi-Task Learning for Uncertainty-Aware Semantic Surface Mapping from Monocular RGB Images
Menon, Rohit, Dengler, Nils, Pan, Sicong, Chenchani, Gokul Krishna, Bennewitz, Maren
For scene understanding in unstructured environments, accurate and uncertainty-aware metric-semantic mapping is required to enable informed action selection by autonomous systems. Existing mapping methods often suffer from overconfident semantic predictions and from sparse, noisy depth sensing, leading to inconsistent map representations. In this paper, we therefore introduce EvidMTL, a multi-task learning framework that uses evidential heads for depth estimation and semantic segmentation, enabling uncertainty-aware inference from monocular RGB images. To enable uncertainty-calibrated evidential multi-task learning, we propose a novel evidential depth loss function that jointly optimizes the belief strength of the depth prediction in conjunction with an evidential segmentation loss. Building on this, we present EvidKimera, an uncertainty-aware semantic surface mapping framework, which uses evidential depth and semantics prediction for improved 3D metric-semantic consistency. We train and evaluate EvidMTL on NYUDepthV2 and assess its zero-shot performance on ScanNetV2, demonstrating superior uncertainty estimation compared to conventional approaches while maintaining comparable depth estimation and semantic segmentation performance. In zero-shot mapping tests on ScanNetV2, EvidKimera outperforms Kimera in semantic surface mapping accuracy and consistency, highlighting the benefits of uncertainty-aware mapping and underscoring its potential for real-world robotic applications.
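As a rough illustration of what an evidential segmentation head can report per pixel, the sketch below uses the common Dirichlet formulation of evidential classification. This is an assumption: the abstract does not state which parameterisation EvidMTL uses, the function name `dirichlet_summary` is hypothetical, and the evidential depth head is not covered here.

```python
def dirichlet_summary(evidence):
    """Summarise non-negative per-class evidence for one pixel.

    Assumed (not taken from the abstract): alpha_k = evidence_k + 1,
    expected class probability alpha_k / S, belief mass evidence_k / S,
    and vacuity (overall uncertainty) K / S, where S = sum(alpha).
    """
    if any(e < 0 for e in evidence):
        raise ValueError("evidence must be non-negative")
    num_classes = len(evidence)
    alpha = [e + 1.0 for e in evidence]
    strength = sum(alpha)
    probs = [a / strength for a in alpha]
    beliefs = [e / strength for e in evidence]
    vacuity = num_classes / strength
    return probs, beliefs, vacuity


# Strong evidence for class 0 versus almost no evidence for any class.
confident = dirichlet_summary([8.0, 1.0, 1.0])
unsure = dirichlet_summary([0.2, 0.1, 0.1])
print("vacuity (confident):", round(confident[2], 3))
print("vacuity (unsure):", round(unsure[2], 3))
```

In this formulation the belief masses and the vacuity sum to one, so a pixel with little total evidence is flagged as uncertain even when its expected class probabilities are not uniform.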
Review for NeurIPS paper: Depth Uncertainty in Neural Networks
Weaknesses: My main concern is with the experiments section. It is great to see the method demonstrated on ResNet-50, but I could not find a comparison of test accuracy anywhere in the paper; only log-likelihood is reported, in the appendix. I would expect a comparison of test accuracy among DUN, MC-dropout, deep ensembles, and deep ensembles with different depths.
Review for NeurIPS paper: Depth Uncertainty in Neural Networks
This paper proposes to treat the depth of the network as a random variable and to marginalize over it to obtain better uncertainty estimates. The authors show that this can be performed efficiently in a single forward pass, and demonstrate improved uncertainty estimates on both regression and classification benchmarks (including corrupted versions and out-of-distribution evaluation). The reviewers initially raised several questions, particularly on the experimental setup, the flexibility of the depth posterior, and comparisons to stochastic depth and dropout. During the discussion, the reviewers agreed that the author rebuttal satisfactorily addresses the major concerns, and some of them raised their scores accordingly. Overall, this is a good paper and I recommend acceptance.
Next Best Sense: Guiding Vision and Touch with FisherRF for 3D Gaussian Splatting
Strong, Matthew, Lei, Boshu, Swann, Aiden, Jiang, Wen, Daniilidis, Kostas, Kennedy, Monroe III
We propose a framework for active next best view and touch selection for robotic manipulators using 3D Gaussian Splatting (3DGS). 3DGS is emerging as a useful explicit 3D scene representation for robotics, as it can represent scenes in a manner that is both photorealistic and geometrically accurate. However, in real-world, online robotic scenes where the number of views is limited by efficiency requirements, random view selection for 3DGS becomes impractical, as views are often overlapping and redundant. We address this issue by proposing an end-to-end online training and active view selection pipeline, which enhances the performance of 3DGS in few-view robotics settings. We first improve the performance of few-shot 3DGS with a novel semantic depth alignment method using Segment Anything Model 2 (SAM2), which we supplement with Pearson depth and surface normal losses to improve color and depth reconstruction of real-world scenes. We then extend FisherRF, a next-best-view selection method for 3DGS, to select views and touch poses based on depth uncertainty. We perform online view selection on a real robot system during live 3DGS training. We motivate our improvements to few-shot GS scenes and extend depth-based FisherRF to them, demonstrating both qualitative and quantitative improvements on challenging robot scenes. For more information, please see our project page at https://arm.stanford.edu/next-best-sense.
Depth Uncertainty in Neural Networks
Existing methods for estimating uncertainty in deep learning tend to require multiple forward passes, making them unsuitable for applications where computational resources are limited. To solve this, we perform probabilistic reasoning over the depth of neural networks. Different depths correspond to subnetworks which share weights and whose predictions are combined via marginalisation, yielding model uncertainty. By exploiting the sequential structure of feed-forward networks, we are able to both evaluate our training objective and make predictions with a single forward pass. Our approach provides uncertainty calibration, robustness to dataset shift, and accuracies competitive with more computationally expensive baselines.
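The idea of marginalising over depth in one pass can be sketched as follows. This is a minimal toy, not the authors' implementation: the residual `block`, the shared `output_head`, the input, and the depth probabilities are all hypothetical placeholders, and a real model would learn both the weights and the distribution over depths.

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def block(h, scale):
    # Hypothetical residual block, applied elementwise.
    return [v + math.tanh(scale * v) for v in h]


def output_head(h):
    # Hypothetical head shared by every depth: 2-d activation -> 2 logits.
    return [h[0] - h[1], h[1] - h[0]]


def depth_marginal_predict(x, block_scales, depth_probs):
    """Run the blocks once, predict after each one, and mix the predictions."""
    assert len(depth_probs) == len(block_scales) + 1
    h = list(x)
    per_depth = [softmax(output_head(h))]  # depth 0: no blocks applied
    for scale in block_scales:
        h = block(h, scale)  # reuses the activations of the shallower subnetwork
        per_depth.append(softmax(output_head(h)))
    n_classes = len(per_depth[0])
    marginal = [
        sum(q * p[c] for q, p in zip(depth_probs, per_depth))
        for c in range(n_classes)
    ]
    return per_depth, marginal


def entropy(p):
    return -sum(v * math.log(v) for v in p if v > 0.0)


depth_probs = [0.1, 0.2, 0.3, 0.4]  # assumed distribution over depths 0..3
per_depth, marginal = depth_marginal_predict([0.5, -0.2], [0.3, 0.7, 1.1], depth_probs)
print("marginal prediction:", [round(v, 3) for v in marginal])
print("predictive entropy:", round(entropy(marginal), 3))
```

Because each depth's activations are the input to the next block, all per-depth predictions fall out of a single sequential pass; disagreement between depths shows up as extra entropy in the mixed prediction.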
GUPNet++: Geometry Uncertainty Propagation Network for Monocular 3D Object Detection
Lu, Yan, Ma, Xinzhu, Yang, Lei, Zhang, Tianzhu, Liu, Yating, Chu, Qi, He, Tong, Li, Yonghui, Ouyang, Wanli
Geometry plays a significant role in monocular 3D object detection. It can be used to estimate object depth from the perspective projection between an object's physical size and its 2D projection in the image plane, which introduces mathematical priors into deep models. However, this projection process also introduces error amplification, where the error of the estimated height is amplified and reflected in the projected depth. This leads to unreliable depth inferences and also impairs training stability. To tackle this problem, we propose a novel Geometry Uncertainty Propagation Network (GUPNet++) that models geometry projection in a probabilistic manner. This ensures depth predictions are well-bounded and associated with a reasonable uncertainty. The significance of introducing such geometric uncertainty is two-fold: (1) It models the uncertainty propagation relationship of the geometry projection during training, improving the stability and efficiency of end-to-end model learning. (2) It can be used to derive a highly reliable confidence that indicates the quality of the 3D detection result, enabling more reliable detection inference. Experiments show that the proposed approach not only obtains state-of-the-art (SOTA) performance in image-based monocular 3D detection but also demonstrates superior efficacy with a simplified framework.
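The error amplification described above can be made concrete with the pinhole relation depth = focal length × 3D height / 2D height. The sketch below is a first-order illustration under the assumption that only the 3D height estimate is uncertain; it is not the paper's formulation, and the function names and numbers are hypothetical.

```python
def project_depth(focal_px, height_3d_m, height_2d_px):
    """Pinhole projection: depth = f * H_3d / h_2d."""
    return focal_px * height_3d_m / height_2d_px


def propagate_height_std(focal_px, height_3d_std_m, height_2d_px):
    """Depth std when only H_3d is uncertain (depth is linear in H_3d)."""
    return focal_px * height_3d_std_m / height_2d_px


focal_px = 700.0      # assumed focal length in pixels
height_2d_px = 30.0   # assumed height of the object's 2D box in pixels
depth_m = project_depth(focal_px, 1.5, height_2d_px)
depth_std_m = propagate_height_std(focal_px, 0.1, height_2d_px)
print("depth (m):", depth_m)
print("depth std (m):", round(depth_std_m, 2))
```

With these assumed values, a 0.1 m error in the estimated height becomes an error of more than 2 m in depth, because the height error is scaled by f / h_2d, which grows as objects appear smaller in the image.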
Towards Open World NeRF-Based SLAM
Lisus, Daniil, Holmes, Connor, Waslander, Steven
Neural Radiance Fields (NeRFs) offer versatility and robustness in map representations for Simultaneous Localization and Mapping (SLAM) tasks. This paper extends NICE-SLAM, a recent state-of-the-art NeRF-based SLAM algorithm capable of producing high quality NeRF maps. However, depending on the hardware used, the required number of iterations to produce these maps often makes NICE-SLAM run at less than real time. Additionally, the estimated trajectories fail to be competitive with classical SLAM approaches. Finally, NICE-SLAM requires a grid covering the considered environment to be defined prior to runtime, making it difficult to extend into previously unseen scenes. This paper seeks to make NICE-SLAM more open-world-capable by improving the robustness and tracking accuracy, and generalizing the map representation to handle unconstrained environments. This is done by improving measurement uncertainty handling, incorporating motion information, and modelling the map as having an explicit foreground and background. It is shown that these changes are able to improve tracking accuracy by 85% to 97% depending on the available resources, while also improving mapping in environments with visual information extending outside of the predefined grid.
A technique to jointly estimate depth and depth uncertainty for unmanned aerial vehicles
Fonder, Michaël, Van Droogenbroeck, Marc
When used by autonomous vehicles for trajectory planning or obstacle avoidance, depth estimation methods need to be reliable. Therefore, estimating the quality of the depth outputs is critical. In this paper, we show how M4Depth, a state-of-the-art depth estimation method designed for unmanned aerial vehicle (UAV) applications, can be enhanced to perform joint depth and uncertainty estimation. To this end, we present a solution to convert the uncertainty estimates related to parallax generated by M4Depth into uncertainty estimates related to depth, and show that it outperforms the standard probabilistic approach. Our experiments on various public datasets demonstrate that our method performs consistently, even in zero-shot transfer. Moreover, our method compares favorably with existing multi-view depth estimation methods: it performs similarly on a multi-view depth estimation benchmark while being 2.5 times faster and, unlike those methods, causal. The code of our method is publicly available at https://github.com/michael-fonder/M4DepthU.
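To give a feel for why parallax uncertainty cannot be reused directly as depth uncertainty, the sketch below applies generic first-order error propagation to the simplified relation depth = focal length × baseline / parallax. This is an assumption made for illustration (pure lateral camera translation); it is not the conversion proposed in the paper, and all names and values are hypothetical.

```python
def depth_from_parallax(focal_px, baseline_m, parallax_px):
    """Simplified relation for a purely lateral translation: z = f * b / p."""
    return focal_px * baseline_m / parallax_px


def depth_std_from_parallax_std(focal_px, baseline_m, parallax_px, parallax_std_px):
    """First-order propagation: sigma_z ~= |dz/dp| * sigma_p = f * b / p**2 * sigma_p."""
    return focal_px * baseline_m / parallax_px ** 2 * parallax_std_px


focal_px, baseline_m = 400.0, 0.2  # assumed camera intrinsics and motion
near = depth_std_from_parallax_std(focal_px, baseline_m, 16.0, 0.5)
far = depth_std_from_parallax_std(focal_px, baseline_m, 4.0, 0.5)
print("depth at 4 px parallax (m):", depth_from_parallax(focal_px, baseline_m, 4.0))
print("depth std, near vs far (m):", near, far)
```

Under this simplification, the same half-pixel parallax uncertainty translates into a depth uncertainty that grows with the square of the depth, so distant points are far less certain than nearby ones.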
Volumetric Occupancy Mapping With Probabilistic Depth Completion for Robotic Navigation
Popovic, Marija, Thomas, Florian, Papatheodorou, Sotiris, Funk, Nils, Vidal-Calleja, Teresa, Leutenegger, Stefan
In robotic applications, a key requirement for safe and efficient motion planning is the ability to map obstacle-free space in unknown, cluttered 3D environments. However, commodity-grade RGB-D cameras commonly used for sensing fail to register valid depth values on shiny, glossy, bright, or distant surfaces, leading to missing data in the map. To address this issue, we propose a framework leveraging probabilistic depth completion as an additional input for spatial mapping. We introduce a deep learning architecture providing uncertainty estimates for the depth completion of RGB-D images. Our pipeline exploits the inferred missing depth values and depth uncertainty to complement raw depth images and improve the speed and quality of free space mapping. Evaluations on synthetic data show that, in different indoor environments, our approach maps significantly more correct free space with relatively low error than using raw data alone, thereby producing more complete maps that can be directly used for robotic navigation tasks. The performance of our framework is validated using real-world data.