Review for NeurIPS paper: Bi-level Score Matching for Learning Energy-based Latent Variable Models

Neural Information Processing Systems

Weaknesses: The authors neglect to compare against the two most closely related works I am aware of. First, they briefly mention variational noise contrastive estimation, which can also be used to train models like those presented in this work; while that method has not yet been shown to scale to high-dimensional image data, it should be used as a comparison at least on the toy data. Second, "Variational Inference for Sparse and Undirected Models" (Ingraham & Marks) provides a method for parameter inference in EBLVMs; it could also serve as a comparison, but at the very least should be discussed in the related work. Finally, the proposed method requires two inner-loop optimizations (N x K steps) for each model gradient update.


Review for NeurIPS paper: Bi-level Score Matching for Learning Energy-based Latent Variable Models


All reviewers agree this is interesting work that successfully trains energy-based latent variable models with score matching. There were concerns about the clarity of the algorithm, the utility of the latent variables, the complexity of the bi-level optimization process, and missing baselines, all of which should be addressed (as promised in the rebuttal) in the final version of the paper.


Bi-level Score Matching for Learning Energy-based Latent Variable Models


Score matching (SM) provides a compelling approach to learning energy-based models (EBMs) because it avoids calculating the partition function. However, learning energy-based latent variable models (EBLVMs) remains largely open, except in some special cases. This paper presents a bi-level score matching (BiSM) method to learn EBLVMs with general structures by reformulating SM as a bi-level optimization problem: the higher level introduces a variational posterior of the latent variables and optimizes a modified SM objective, and the lower level optimizes the variational posterior to fit the true posterior. To solve BiSM efficiently, we develop a stochastic optimization algorithm with gradient unrolling.
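The bi-level structure described in the abstract can be illustrated with a minimal sketch: K unrolled inner gradient steps fit an auxiliary variable phi (standing in for the variational-posterior parameters), then one outer gradient step updates the model parameter theta. The quadratic objectives, step sizes, and the first-order hypergradient approximation below are all illustrative assumptions for a toy problem, not the paper's actual BiSM objective.

```python
def inner_step(phi, theta, lr=0.2):
    # Lower level (toy stand-in): pull phi toward theta,
    # i.e. one gradient step on g(phi; theta) = (phi - theta)^2.
    return phi - lr * 2.0 * (phi - theta)

def bilevel_update(theta, phi, K=5, lr_theta=0.1):
    # Unroll K inner-loop steps (the "N x K" inner optimization
    # mentioned in the review, with N = 1 variable here).
    for _ in range(K):
        phi = inner_step(phi, theta)
    # Higher level (toy objective): (theta - 2)^2 + (phi - 2)^2,
    # differentiated w.r.t. theta with phi treated as fixed
    # (a common first-order approximation of the unrolled gradient).
    grad_theta = 2.0 * (theta - 2.0)
    return theta - lr_theta * grad_theta, phi

theta, phi = 0.0, 0.0
for _ in range(100):
    theta, phi = bilevel_update(theta, phi)
print(round(theta, 3), round(phi, 3))  # prints 2.0 2.0
```

In the method as described, the hypergradient is obtained by differentiating through the unrolled inner steps; the sketch above uses the cruder stop-gradient approximation purely to keep the example self-contained.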