Review for NeurIPS paper: 3D Shape Reconstruction from Vision and Touch

Neural Information Processing Systems 

I would suggest removing this claim -- in machine learning we seem to anthropomorphize our algorithms with little evidence. "...touch provides high fidelity localized information while vision provides complementary global context" A counterexample to this claim is the case of congenitally blind people, who seem to have no problem describing the global context of things they touch. See "Imagery in the congenitally blind: How visual are visual images?" (Zimler and Keenan, 1983).

- The way the paper presents the idea of using charts makes it seem like a novel contribution, but in reality it is built on top of AtlasNet, which also uses the term "chart" to describe its method. In fact, a follow-up paper to AtlasNet [a], which the paper does not cite, generalizes the charts idea even further. Therefore, I would suggest toning down statements such as "...which we call charts" that make it seem like this is a novel contribution.