Review for NeurIPS paper: 3D Shape Reconstruction from Vision and Touch

This paper proposes to fuse vision and haptic (touch) information to reconstruct the 3D shape of objects manipulated by a robotic hand. Objects are represented as a collection of deformable meshes, following the chart-based representation introduced in the previously published AtlasNet paper. Vision charts and touch charts are merged with graph convolutional networks that allow both local and cross-modality communication between charts. Experiments are conducted in simulation on a new dataset designed by the authors, which provides known hand and object surface structure together with paired vision and touch inputs. After the rebuttal, reviewers gave scores between 6 and 7.
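To make the chart-merging idea concrete, the following is a minimal sketch (not the authors' implementation) of cross-chart message passing: each chart, whether it comes from vision or touch, is a graph node with a feature vector, and one graph-convolution step aggregates features over local and cross-modal edges. The chart counts, feature dimension, and adjacency pattern are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vision, n_touch, d = 3, 2, 4          # assumed chart counts and feature size
n = n_vision + n_touch
features = rng.standard_normal((n, d))  # one feature vector per chart

# Adjacency: vision charts communicate locally with each other, and every
# touch chart is connected to every vision chart (cross-modal edges).
A = np.zeros((n, n))
A[:n_vision, :n_vision] = 1.0           # vision <-> vision (local)
A[:n_vision, n_vision:] = 1.0           # vision <-> touch (cross-modal)
A[n_vision:, :n_vision] = 1.0
np.fill_diagonal(A, 1.0)                # self-loops keep each chart's own state

# One graph-convolution layer: degree-normalized neighbor aggregation,
# a learned linear map, and a ReLU nonlinearity.
D_inv = np.diag(1.0 / A.sum(axis=1))
W = rng.standard_normal((d, d)) * 0.1
updated = np.maximum(D_inv @ A @ features @ W, 0.0)
print(updated.shape)                    # one updated feature vector per chart
```

In the paper, stacked layers of this kind let touch observations refine the vision-predicted surface and vice versa before the charts are deformed into the final mesh.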