

83fa5a432ae55c253d0e60dbfa716723-Paper.pdf

Neural Information Processing Systems

Research efforts on learning implicit 3D shapes without 3D supervision have primarily resorted to binary occupancy [26, 34] as the representation, aiming to match reprojected 3D occupancy to the given binary masks. Current works adopting signed distance functions (SDF) either require a pretrained deep shape prior [27] or are limited to discretized representations [14] that do not scale up with resolution.
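The excerpt above contrasts discretized representations with continuous implicit ones. As a minimal sketch (not code from the paper; all names are illustrative), an implicit SDF can be parameterized by a coordinate MLP that maps any continuous 3D point to a signed distance, so memory does not grow cubically with resolution the way a voxel occupancy grid does:

```python
# Minimal illustrative sketch: an implicit SDF as a coordinate MLP.
import torch
import torch.nn as nn

class ImplicitSDF(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),          # signed distance to the surface
        )

    def forward(self, xyz):                # xyz: (..., 3) continuous coordinates
        return self.net(xyz).squeeze(-1)   # (...,) signed distances

sdf = ImplicitSDF()
points = torch.rand(1024, 3) * 2 - 1       # query points anywhere in [-1, 1]^3
distances = sdf(points)                    # no fixed grid resolution involved
```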


SDF-SRN: Learning Signed Distance 3D Object Reconstruction from Static Images

Neural Information Processing Systems

Dense 3D object reconstruction from a single image has recently witnessed remarkable advances, but supervising neural networks with ground-truth 3D shapes is impractical due to the laborious process of creating paired image-shape datasets. Recent efforts have turned to learning 3D reconstruction without 3D supervision from RGB images with annotated 2D silhouettes, dramatically reducing the cost and effort of annotation. These techniques, however, remain impractical as they still require multi-view annotations of the same object instance during training. As a result, most experimental efforts to date have been limited to synthetic datasets. In this paper, we address this issue and propose SDF-SRN, an approach that requires only a single view of objects at training time, offering greater utility for real-world scenarios. SDF-SRN learns implicit 3D shape representations to handle arbitrary shape topologies that may exist in the datasets. To this end, we derive a novel differentiable rendering formulation for learning signed distance functions (SDF) from 2D silhouettes. Our method outperforms the state of the art under challenging single-view supervision settings on both synthetic and real-world datasets.
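The abstract mentions a differentiable rendering formulation for learning SDFs from 2D silhouettes but does not spell it out here. The snippet below is only a generic illustration of silhouette supervision from ray-sampled SDF values (a soft inside/outside test compared against the annotated mask), not the paper's actual derivation; `sdf`, `ray_points`, and the sharpness parameter `alpha` are assumed placeholders.

```python
# Generic soft-silhouette supervision from an SDF (illustration only).
import torch
import torch.nn.functional as F

def soft_silhouette(sdf, ray_points, alpha=50.0):
    # sdf: callable mapping (..., 3) points to (...,) signed distances
    # ray_points: (num_rays, num_samples, 3) points sampled along each pixel ray
    d = sdf(ray_points)                                 # (num_rays, num_samples)
    occupancy = torch.sigmoid(-alpha * d)               # inside (d < 0) -> ~1, outside -> ~0
    # A ray hits the object if any sample lies inside; use a smooth "any".
    return 1.0 - torch.prod(1.0 - occupancy, dim=-1)    # (num_rays,)

def silhouette_loss(sdf, ray_points, mask_values):
    # mask_values: (num_rays,) binary silhouette labels for the sampled pixels
    pred = soft_silhouette(sdf, ray_points)
    return F.binary_cross_entropy(pred.clamp(1e-5, 1 - 1e-5), mask_values.float())
```

This crude sketch only encourages the predicted sign pattern to agree with each silhouette; the formulation derived in the paper is more involved than this.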



Review for NeurIPS paper: SDF-SRN: Learning Signed Distance 3D Object Reconstruction from Static Images

Neural Information Processing Systems

Weaknesses: - The dependence on the camera pose somewhat limits the applicability. It also raises the question of how much the proposed method relies on high-quality camera poses. Similarly, it relies on high-quality silhouette images, and it is not quite clear how robust the method would be in practical setups with silhouettes coming from, e.g., an instance segmentation algorithm. An experiment with different levels of noise for the camera parameters and less-than-perfect silhouettes would be helpful to assess the robustness in real-world settings. Other work directly optimized the MLPs or conditioned predictions from the MLPs via latent codes; I would suspect those approaches to be more robust than a network predicting network weights.
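The robustness study the reviewer asks for could, hypothetically, be set up by training and evaluating with ground-truth camera extrinsics perturbed at increasing noise levels. The helper below is one possible sketch of such a perturbation (a small Rodrigues rotation jitter plus translation noise); it is not part of SDF-SRN, and the noise magnitudes are arbitrary examples.

```python
# Hypothetical robustness probe: jitter camera extrinsics with Gaussian noise.
import math
import torch

def perturb_pose(R, t, rot_noise_deg=2.0, trans_noise=0.01):
    # R: (3, 3) rotation, t: (3,) translation of the ground-truth camera pose.
    axis = torch.randn(3)
    axis = axis / axis.norm()                              # random rotation axis
    angle = torch.randn(()).item() * math.radians(rot_noise_deg)
    ax, ay, az = axis.tolist()
    K = torch.tensor([[0.0, -az,  ay],
                      [ az, 0.0, -ax],
                      [-ay,  ax, 0.0]])                    # cross-product matrix
    dR = torch.eye(3) + math.sin(angle) * K + (1 - math.cos(angle)) * (K @ K)  # Rodrigues
    return dR @ R, t + torch.randn(3) * trans_noise
```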


SDF-SRN: Learning Signed Distance 3D Object Reconstruction from Static Images

Lin, Chen-Hsuan, Wang, Chaoyang, Lucey, Simon

arXiv.org Artificial Intelligence
