Review for NeurIPS paper: Grasp Proposal Networks: An End-to-End Solution for Visual Learning of Robotic Grasps

Neural Information Processing Systems 

This paper proposes an approach for predicting multiple stable 6-DOF grasps, each with an associated confidence value, for standard parallel-jaw grippers from object point cloud inputs. Grasps are represented as tuples consisting of the contact points of the two jaws and the pitch angle of the gripper. This representation motivates the new architectural choices proposed here, which are inspired by standard architectures in 2D object detection. While the network is trained end-to-end, it is internally decomposed in a sensible stage-wise manner. The authors also create a synthetic dataset of 22.6M 6-DOF grasps, built on ShapeNet objects using physics simulation, which will be the largest such dataset upon public release. Finally, there are some limited transfer results showing that the approach carries over to real-world grasping with an acceptable drop in performance.