surface light field
DELIFFAS: Deformable Light Fields for Fast Avatar Synthesis
Generating controllable and photorealistic digital human avatars is a long-standing and important problem in Vision and Graphics. Recent methods have shown great progress in terms of either photorealism or inference speed, but combining the two desired properties remains unsolved. To this end, we propose a novel method, called DELIFFAS, which parameterizes the appearance of the human as a surface light field that is attached to a controllable and deforming human mesh model. At the core, we represent the light field around the human with a deformable two-surface parameterization, which enables fast and accurate inference of the human appearance. This allows perceptual supervision on the full image, whereas previous approaches could only supervise individual pixels or small patches due to their slow runtime. Our carefully designed human representation and supervision strategy lead to state-of-the-art synthesis results and inference time. The video results and code are available at https://vcai.
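The abstract does not spell out how the two-surface parameterization is built. As a rough illustration only, the sketch below uses the classical fixed two-plane light field parameterization as a stand-in: a ray is indexed by the points where it crosses two parallel planes. The function name `two_plane_coords` and the plane positions `z_near` and `z_far` are assumptions made for this example, and the deformable, mesh-attached surfaces described in the abstract are not modeled.

```python
import numpy as np


def two_plane_coords(origin, direction, z_near=0.0, z_far=1.0):
    """Index a ray by its crossings with the planes z = z_near and z = z_far.

    Returns the 4D coordinate (u, v, s, t), where (u, v) is the crossing
    with the near plane and (s, t) the crossing with the far plane.
    Hypothetical helper for illustration, not the DELIFFAS implementation.
    """
    origin = np.asarray(origin, dtype=float)
    direction = np.asarray(direction, dtype=float)
    if abs(direction[2]) < 1e-12:
        raise ValueError("ray is parallel to the parameterization planes")
    # Ray parameter at which the ray reaches each plane.
    k_near = (z_near - origin[2]) / direction[2]
    k_far = (z_far - origin[2]) / direction[2]
    u, v = (origin + k_near * direction)[:2]
    s, t = (origin + k_far * direction)[:2]
    return np.array([u, v, s, t])


# Two different origins on the same ray map to the same 4D coordinate.
coords_a = two_plane_coords([0.0, 0.0, -1.0], [0.5, 0.0, 1.0])
coords_b = two_plane_coords([0.5, 0.0, 0.0], [0.5, 0.0, 1.0])
```

Because the coordinate depends only on the ray and not on where along it the camera sits, a single lookup per pixel is enough to query appearance, which is one reason such parameterizations are fast to evaluate.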
R1, R3: There is no ability to disentangle lighting and material, the paper is misleading in that aspect
We thank the reviewers for their insightful comments. We next address questions and comments raised in the reviews. R1, R3: There is no ability to disentangle lighting and material, the paper is misleading in that aspect. In Section 3.2 we will clearly state that, in theory, incorporating the surface […] This is in contrast to MVS pipelines (e.g., […] In some cases, such as the "Fountain" scene, our method can go beyond […] R1, R2: Training and inference times are missing. All relevant details will be added to the text.
Review for NeurIPS paper: Multiview Neural Surface Reconstruction by Disentangling Geometry and Appearance
Clarity: As written, I find the explanation of the "implicit differentiable renderer" to be highly misleading. Throughout the paper, the network M is described as a "differentiable renderer" that accounts for both BRDF and lighting conditions. Indeed, the TITLE of the paper implies that lighting and materials are being recovered in the style of full inverse rendering. Section 3.2 introduces the rendering equation and makes a big deal about specifying the BRDF and light sources. However, this is all rendered pointless by lines 155-156: "Replacing M0 with a (sufficiently large) MLP approximation M provides the radiance approximation…" Rolling up all these factors into one big function means you are learning a surface light field, nothing more.
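The reviewer's point can be made concrete with the standard rendering equation. The block below is a sketch in assumed notation, not copied from the paper under review: $B$ denotes the BRDF, $L_e$ the emitted radiance, $L_i$ the incoming radiance, $n$ the surface normal at point $x$, and $\theta$ the MLP weights.

```latex
% Standard rendering equation: BRDF and lighting appear as separate factors.
L_o(x, \omega_o) = L_e(x, \omega_o)
  + \int_{\Omega} B(x, n, \omega_i, \omega_o)\, L_i(x, \omega_i)\,
    (n \cdot \omega_i)\, \mathrm{d}\omega_i

% Replacing the whole right-hand side with one learned function of the
% surface point, normal, and viewing direction:
L_o(x, \omega_o) \approx M(x, n, \omega_o;\, \theta)
```

In the second line neither $B$ nor $L_i$ appears as a separate quantity, so fitting $M$ to images recovers outgoing radiance per surface point and view direction, which is a surface light field, without separating material from lighting.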
DELIFFAS: Deformable Light Fields for Fast Avatar Synthesis
Kwon, Youngjoong, Liu, Lingjie, Fuchs, Henry, Habermann, Marc, Theobalt, Christian
NSLF-OL: Online Learning of Neural Surface Light Fields alongside Real-time Incremental 3D Reconstruction
Immersive novel view generation is an important technology in the field of graphics and has recently also received attention for operator-based human-robot interaction. However, training such models is time-consuming, so current evaluation focuses mainly on object capture. This limits the use of such models for 3D reconstruction in the robotics community, since robots (1) usually capture only a very small range of view directions per surface, which leads to arbitrary predictions for unseen, novel directions, (2) require real-time algorithms, and (3) work with growing scenes, e.g., in robotic exploration. The paper proposes a novel Neural Surface Light Fields model that copes with the small range of view directions while producing good results in unseen directions. By exploiting recent encoding techniques, training of our model is highly efficient. In addition, we design Multiple Asynchronous Neural Agents (MANA), a universal framework that learns each small region in parallel for large-scale growing scenes. Our model learns the Neural Surface Light Fields (NSLF) online, alongside real-time 3D reconstruction, with a sequential data stream as the shared input. In addition to online training, our model also provides real-time rendering for visualization once the data stream is complete. We conduct experiments on well-known RGBD indoor datasets, showing the high flexibility of embedding our model into real-time 3D reconstruction and demonstrating high-fidelity view synthesis for these scenes. The code is available on GitHub.