Supplementary Material for Learning Neural Implicit through Volume Rendering with Attentive Depth Fusion Priors
Neural Information Processing Systems
All MLP decoders have 5 fully-connected blocks, each with a hidden feature dimension of 32. For scene geometry optimization, we use 60 iterations on both Replica [10] and ScanNet [1]. For camera tracking, we use 10 iterations on Replica [10] and 50 iterations on ScanNet [1]. After the tracking procedure at time step t, the after-fusion stage first uses the estimated t-th camera pose to fuse the t-th depth image into T, which has already fused all preceding depth images. Beyond the averaged results in our paper, we report more detailed results on Replica [10] and ScanNet [1] in Tab. 1 and Tab. 2.
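The hyperparameters above can be collected into a small configuration sketch. This is only an illustration of the stated settings, not the authors' code: the names `DecoderConfig`, `GEOMETRY_ITERS`, `TRACKING_ITERS`, `iterations_for`, and `layer_widths` are hypothetical, and the final output layer and the `in_dim`/`out_dim` arguments are assumptions that the text does not specify.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class DecoderConfig:
    """Decoder settings stated in the text (class name is hypothetical)."""
    num_blocks: int = 5   # fully-connected blocks per MLP decoder
    hidden_dim: int = 32  # hidden feature dimension of each block


# Iteration counts stated in the text, keyed by lowercase dataset name.
GEOMETRY_ITERS: Dict[str, int] = {"replica": 60, "scannet": 60}
TRACKING_ITERS: Dict[str, int] = {"replica": 10, "scannet": 50}


def iterations_for(dataset: str) -> Tuple[int, int]:
    """Return (tracking_iters, geometry_iters) for a dataset name."""
    key = dataset.lower()
    if key not in TRACKING_ITERS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    return TRACKING_ITERS[key], GEOMETRY_ITERS[key]


def layer_widths(cfg: DecoderConfig, in_dim: int, out_dim: int) -> List[Tuple[int, int]]:
    """List the (input, output) size of each layer.

    Assumption: cfg.num_blocks hidden blocks are followed by one output
    layer; in_dim and out_dim are placeholders chosen by the caller.
    """
    dims = [in_dim] + [cfg.hidden_dim] * cfg.num_blocks + [out_dim]
    return list(zip(dims[:-1], dims[1:]))
```

For example, `iterations_for("ScanNet")` yields the tracking and geometry iteration counts for ScanNet, and `layer_widths(DecoderConfig(), 32, 1)` lists five 32-wide blocks followed by the assumed output layer.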
Feb-10-2025, 13:43:45 GMT