geometric cue
Supplementary Material for MonoSDF: Exploring Monocular Geometric Cues for Neural Implicit Surface Reconstruction
In this section, we first present an overview of 4 different architectures for neural implicit scene representations and details of Multi-Res. See Figure 1 for an overview of the architectures. More specifically, each grid contains up to T feature vectors with dimensionality F. We further report Normal Consistency for the Replica dataset following [9,13,18,19,23,32] as near-perfect ground truth is available. We observe that using more input views for training improves reconstruction quality.
Supplementary Material for MonoSDF: Exploring Monocular Geometric Cues for Neural Implicit Surface Reconstruction Zehao Yu
Grids in Section 1.1 and provide details of the depth loss In the following, we provide details for Multi-Res. For our single MLP architecture, we use an 8-layer MLP with hidden dimension 256. We use a two-layer MLP with hidden dimension 256 for the SDF prediction for both, Single-Res. For the DTU dataset [1], we follow the official evaluation protocol and report the reconstruction quality with: Accuracy, Completeness and Chamfer Distance. The Chamfer Distance is the mean of Accuracy and Completeness.
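The relation between the three DTU metrics mentioned in the snippet above can be illustrated with a minimal sketch. The snippet states only that the distance is the mean of Accuracy and Completeness; the definitions used here (Accuracy as the mean distance from each predicted point to its nearest ground-truth point, Completeness as the reverse) are the commonly used convention and are an assumption, and the function names are hypothetical. The official DTU evaluation protocol involves further details that are not reproduced here.

```python
import math

def nearest_distance(point, cloud):
    """Euclidean distance from `point` to its nearest neighbour in `cloud`."""
    return min(math.dist(point, other) for other in cloud)

def accuracy(pred, gt):
    """Assumed definition: mean distance from predicted points to the ground truth."""
    return sum(nearest_distance(p, gt) for p in pred) / len(pred)

def completeness(pred, gt):
    """Assumed definition: mean distance from ground-truth points to the prediction."""
    return sum(nearest_distance(q, pred) for q in gt) / len(gt)

def chamfer_distance(pred, gt):
    """Mean of accuracy and completeness, as stated in the snippet."""
    return 0.5 * (accuracy(pred, gt) + completeness(pred, gt))

# Toy point clouds (illustrative values only, not data from the paper).
pred_points = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
gt_points = [(0.0, 0.0, 0.0)]

print(accuracy(pred_points, gt_points))          # 1.0
print(completeness(pred_points, gt_points))      # 0.0
print(chamfer_distance(pred_points, gt_points))  # 0.5
```

The brute-force nearest-neighbour search is quadratic in the number of points; a spatial index such as a k-d tree would be the usual choice for real reconstructions.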
GOOD: Exploring Geometric Cues for Detecting Objects in an Open World
Huang, Haiwen, Geiger, Andreas, Zhang, Dan
We address the task of open-world class-agnostic object detection, i.e., detecting every object in an image by learning from a limited number of base object classes. State-of-the-art RGB-based models suffer from overfitting the training classes and often fail at detecting novel-looking objects. This is because RGB-based models primarily rely on appearance similarity to detect novel objects and are also prone to overfitting short-cut cues such as textures and discriminative parts. To address these shortcomings of RGB-based object detectors, we propose incorporating geometric cues such as depth and normals, predicted by general-purpose monocular estimators. Specifically, we use the geometric cues to train an object proposal network for pseudo-labeling unannotated novel objects in the training set. Our resulting Geometry-guided Open-world Object Detector (GOOD) significantly improves detection recall for novel object categories and already performs well with only a few training classes. Using a single "person" class for training on the COCO dataset, GOOD surpasses SOTA methods by 5.0% AR@100, a relative improvement of 24%.