Review for NeurIPS paper: Bi-level Score Matching for Learning Energy-based Latent Variable Models

Neural Information Processing Systems 

Weaknesses: The authors do not compare against what are probably the two most closely related works that I am aware of. First, the authors briefly mention variational noise-contrastive estimation, which can also be used to train models like those presented in this work. While this method has not yet been shown to scale to high-dimensional image data, it should at the very least be used as a comparison on the toy data. Second, "Variational Inference for Sparse and Undirected Models" (Ingraham & Marks) provides a method for parameter inference in EBLVMs. This method could also be used for comparison, but at the very least it should be included in the related work. Finally, the proposed method requires two inner-loop optimizations (N x K) for each model gradient update.