Review for NeurIPS paper: Canonical 3D Deformer Maps: Unifying parametric and non-parametric methods for dense weakly-supervised category reconstruction
–Neural Information Processing Systems
This is true: CMR did not backpropagate through the texture loss. However, the CVPR 2020 work from Henderson et al. shows that this is possible (https://arxiv.org/abs/2004.04180). This paper may not have been known to the authors (CVPR happened around the NeurIPS deadline), so I'm fine if they correct and discuss this point in the main paper.

To me, these seem to be the main differences between CMR, CSM, and the proposed approach: (i) CMR is akin to a direct method, in that backpropagation through the texture results in a photometric-like loss (it is not quite a photometric loss, since a perceptual loss is used instead, but it is close enough); (ii) CSM learns to establish correspondences from image pixels to a fixed shape template that does not adapt to the depicted shape (the articulated-CSM follow-up paper at CVPR 2020 allows the template to deform, but the deformation is driven by a semi-manually defined skeleton, which does not have the capacity to capture surface details); (iii) the proposed approach learns to establish correspondences from image pixels to the parameterized surface of a (C3DPO) shape basis that then deforms to the depicted shape. In the classical debate of direct versus correspondence methods, I view the proposed method as belonging to the latter camp. My hypothesis is that, similar to how correspondence methods played out in the late 90s and 2000s, the proposed approach may be less susceptible to local minima than direct methods during shape-fitting optimization. I think there is room to investigate this issue more fully, though that may be outside the scope of this paper. I still think (iii) is a hybrid of CMR and CSM (and one that still relies on known keypoints). With that said, I'm changing my mind on this: I find this combination a reasonable idea.
Feb-12-2025, 02:39:36 GMT