Review for NeurIPS paper: Learning Restricted Boltzmann Machines with Sparse Latent Variables

Neural Information Processing Systems 

Additional Feedback: I found the phrase "few latent variables" confusing, since it is not quite what the authors mean. In particular, their bounds depend on "s", which bounds the number of latent variables connected to the variables in the Markov blanket of Xi, taken in the marginal distribution over the visible variables X. This was not clear, in my view, until the formal definition (lines 150-155). Since this style of RBM is not well studied, the practical significance of the learning rates is unclear. The model class is intuitively appealing (many models use "local" latent variables or encourage sparsity), and the work may spur applications of such RBMs, but it is difficult to say that it improves the theory for a well-established problem of importance.
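To make the above concern concrete, here is a minimal sketch of the quantity I understand "s" to be, under my reading of the definition (this is my own illustration, not code from the paper): for each visible X_i, collect the visibles that share a latent neighbor with X_i (its Markov blanket in the marginal over visibles), count the latent variables touching that set, and take the maximum over i.

```python
import numpy as np

def sparsity_s(W, tol=0.0):
    """Hypothetical illustration of the sparsity parameter "s":
    the maximum, over visible units X_i, of the number of latent
    variables connected to X_i's Markov blanket (visibles that
    share a latent neighbor with X_i, including X_i itself)."""
    n_vis, n_hid = W.shape
    adj = np.abs(W) > tol  # visible-latent edge indicator
    s = 0
    for i in range(n_vis):
        latents_i = np.where(adj[i])[0]                       # latents touching X_i
        blanket = np.where(adj[:, latents_i].any(axis=1))[0]  # visibles sharing a latent
        latents_blanket = np.where(adj[blanket].any(axis=0))[0]
        s = max(s, len(latents_blanket))
    return s

# Toy weight matrix: a chain where each latent touches two adjacent visibles.
W = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
print(sparsity_s(W))  # -> 3
```

The point of the sketch is that s is a two-hop quantity: even a visible unit with a single latent neighbor can have a large s if its blanket fans out, which is why "few latent variables" understates the assumption.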