Reviews: Generating Diverse High-Fidelity Images with VQ-VAE-2
–Neural Information Processing Systems
In addition, the model inherits a useful property of AE-based models: it does not suffer from mode collapse. However, it seems to me that the only difference between this paper and the VQ-VAE paper is the hierarchical structure, which learns latent representations and priors at different levels. The novelty therefore looks somewhat low. The paper also gives no explanation of why such a design should improve generative performance. Finally, because of the stop-gradient operator, the loss function (2) is not a reasonable objective to optimize: during optimization, the loss may increase even after taking a gradient step.
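A toy sketch of how the last concern could arise. It is not taken from the paper or the review. It assumes loss (2) has the usual VQ-VAE form, ||x - D(e)||^2 + ||sg[E(x)] - e||^2 + beta * ||sg[e] - E(x)||^2, with the reconstruction gradient copied straight through to the encoder output. The scalar setup, the identity decoder, and all names and numbers (`quantize`, `loss`, `sg_step`, `x`, `z`, `codebook`, `beta`, `lr`) are hypothetical and chosen only so that one update moves the encoder output across a quantization boundary.

```python
import numpy as np

def quantize(z, codebook):
    """Index of the code nearest to the encoder output z."""
    return int(np.argmin((codebook - z) ** 2))

def loss(x, z, codebook, beta):
    """Value of the assumed VQ-VAE loss; sg[] does not change the value."""
    e = codebook[quantize(z, codebook)]
    recon = (x - e) ** 2           # toy assumption: the decoder is the identity
    codebook_term = (z - e) ** 2   # ||sg[z] - e||^2
    commitment = beta * (z - e) ** 2
    return recon + codebook_term + commitment

def sg_step(x, z, codebook, beta, lr):
    """One update using the gradients that the stop-gradients let through."""
    k = quantize(z, codebook)
    e = codebook[k]
    # Encoder output: straight-through reconstruction gradient plus commitment.
    grad_z = 2 * (e - x) + 2 * beta * (z - e)
    # Selected code: codebook term only.
    grad_e = 2 * (e - z)
    new_codebook = codebook.copy()
    new_codebook[k] = e - lr * grad_e
    return z - lr * grad_z, new_codebook

# Hypothetical values: z sits just below the boundary between the two codes.
x, z, beta, lr = 0.45, 0.49, 0.25, 0.1
codebook = np.array([0.0, 1.0])

before = loss(x, z, codebook, beta)
z_new, codebook_new = sg_step(x, z, codebook, beta, lr)
after = loss(x, z_new, codebook_new, beta)

print("code index:", quantize(z, codebook), "->", quantize(z_new, codebook_new))
print("loss:", before, "->", after)
```

In this contrived case the update moves the encoder output past the midpoint between the two codes, the nearest-code assignment switches, and the loss value rises. It shows only that an increase is possible under these assumptions, not that it is typical in practice.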
Jan-24-2025, 03:32:40 GMT