Review for NeurIPS paper: Deep Variational Instance Segmentation

Neural Information Processing Systems 

Weaknesses: The authors often call their method a "one step approach" and criticize other methods for using "heuristic postprocessing". I don't think the authors should be making these claims, as there is a separate network to classify the masks. I also don't really buy the justification at the end of Page 6 that the proposed method needs to verify fewer masks than a typical two-stage method (e.g., Mask-RCNN) and is thus a "one-stage" method. Rather, I think the authors could put more emphasis on the fact that their method does not require any anchors or region proposals, as I think this is a strong argument to be making. I also think the paper could benefit from an analysis of the number of instances that the network can predict. That is, if the network was trained with a maximum of K instances in the training set, can it correctly predict more than K instances at test time?