Reviews: Parameter Learning for Log-supermodular Distributions

Neural Information Processing Systems 

Technical quality: The math seems correct, though showing a few more intermediate steps in the derivations would make the reader's work easier (the proofs could be moved to the supplement to make room). In the proof of Proposition 1, for instance, it would be nice to have more detail on how the final equality is obtained. The experiments compare to an SVM baseline, but not to the parameter learning done in [9]. Specifically, [9] does parameter learning for the spin glass model and reports an error of 1.8%, compared to 8.2% for an SVM. What is different about the learning done in this work (besides the use of a different probabilistic model)?