Reviews: Leveraging the Exact Likelihood of Deep Latent Variable Models

Neural Information Processing Systems 

Updated Review after Rebuttal: After reading the authors' response and re-evaluating the paper, I agree that most of my concerns about a fundamental issue with some of their statements were wrong; hence I am changing my score from 3 to 6.

Going into the detail of the proof, it relies on constructing a generative model where, for one half of the latent space (w.r.t. z ≥ 0), the integral is bounded for all data points, while for the other half the integral diverges for a single data point and goes to zero for the rest. This split allows them to conclude that one integral diverges while all of the others remain finite, and hence the likelihood is infinite (I sketch my reading of this below).

However, I am still not convinced that this issue actually arises in practical settings. First, in practice we optimize an ELBO, which is never tight; for the argument to be convincing, the authors should investigate whether there are settings in which the ELBO itself diverges, rather than only when the variational posterior can perfectly reconstruct the true posterior (see the identity after the sketch below). Furthermore, I still maintain that the results on the Frey Faces dataset are not interpreted correctly: given that this is a fairly small dataset, it is highly likely that the generative model overfits to the data (but not in the way required for the divergence to occur). The experimental evidence in this direction seems somewhat weak; nevertheless, the paper is worth accepting.
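To make the mechanism concrete, here is my reading of the construction in standard Gaussian-DLVM notation (the symbols $\mu_\theta$, $\sigma_\theta$ and the half-spaces $A$, $B$ are mine, not necessarily the paper's). The likelihood is
$$ p_\theta(x) = \int p(z)\,\mathcal{N}\!\left(x \mid \mu_\theta(z), \sigma_\theta(z)^2 I\right) dz . $$
Choose a sequence $\theta_k$ such that on $A$ the integrand is bounded for every data point, while on $B$ we have $\mu_{\theta_k}(z) = x_1$ and $\sigma_{\theta_k}(z) = 1/k$. Then the contribution of $B$ for the single point $x_1$ in dimension $d$ is
$$ \int_B p(z)\,\mathcal{N}\!\left(x_1 \mid x_1, k^{-2} I\right) dz = P(B)\,(2\pi)^{-d/2}\,k^{d} \xrightarrow{\,k \to \infty\,} \infty , $$
while for $x_j \neq x_1$ the Gaussian density on $B$ vanishes. Hence $\log p_{\theta_k}(x_1) \to \infty$ while the remaining terms stay finite, so the total log-likelihood is unbounded.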
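For reference, the bound actually optimized in practice is the standard ELBO (notation mine):
$$ \mathcal{L}(\theta, \phi; x) = \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right] - \mathrm{KL}\!\left(q_\phi(z \mid x) \,\|\, p(z)\right) = \log p_\theta(x) - \mathrm{KL}\!\left(q_\phi(z \mid x) \,\|\, p_\theta(z \mid x)\right). $$
The divergence argument applies to $\log p_\theta(x)$; whether $\sup_{\theta,\phi} \mathcal{L}$ is also infinite for a fixed encoder family, where the posterior KL gap cannot be driven to zero, is exactly the question I would like to see investigated.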