
Hierarchical Quantized Autoencoders



Neural Information Processing Systems

The internet age relies on lossy compression algorithms that transmit information at low bitrates. These algorithms are typically analysed through the rate-distortion trade-off, originally posited by Shannon [33].
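The rate-distortion trade-off can be made concrete with a toy scalar quantizer: spending more bits per sample (more quantization levels) lowers the reconstruction error. This sketch is purely illustrative and not part of the paper's method; all names here are made up.

```python
import math

def quantize(xs, n_levels, lo=-1.0, hi=1.0):
    """Uniformly quantize samples in [lo, hi] to n_levels reconstruction points."""
    step = (hi - lo) / n_levels
    out = []
    for x in xs:
        idx = min(n_levels - 1, max(0, int((x - lo) / step)))
        out.append(lo + (idx + 0.5) * step)  # midpoint of the chosen bin
    return out

def rate_bits(n_levels):
    # rate: bits per sample for a fixed-length code over the levels
    return math.ceil(math.log2(n_levels))

def distortion(xs, ys):
    # distortion: mean squared error between original and reconstruction
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs = [math.sin(0.1 * i) for i in range(100)]
for n in (2, 4, 16):
    print(n, rate_bits(n), distortion(xs, quantize(xs, n)))
```

As the printed table shows, distortion falls monotonically as the rate grows, which is the trade-off the learned codecs below aim to improve at very low rates.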


Hierarchical Quantized Autoencoders

Neural Information Processing Systems

Despite progress in training neural networks for lossy image compression, current approaches fail to maintain both perceptual quality and abstract features at very low bitrates. Encouraged by recent success in learning discrete representations with Vector Quantized Variational Autoencoders (VQ-VAEs), we motivate the use of a hierarchy of VQ-VAEs to attain high factors of compression. We show that the combination of stochastic quantization and hierarchical latent structure aids likelihood-based image compression. This leads us to introduce a novel objective for training hierarchical VQ-VAEs. Our resulting scheme produces a Markovian series of latent variables that reconstruct images of high perceptual quality that retain semantically meaningful features. We provide qualitative and quantitative evaluations on the CelebA and MNIST datasets.
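One plausible sketch of the "stochastic quantization" the abstract alludes to: instead of always snapping an encoder output to its nearest codebook vector (the deterministic VQ-VAE rule), sample a code with probability decreasing in squared distance. The codebook, temperature parameter, and shapes below are illustrative assumptions, not the paper's exact formulation.

```python
import math
import random

def stochastic_quantize(z, codebook, temperature=1.0):
    """Sample a codebook index with probability proportional to
    exp(-||z - c||^2 / temperature); the temperature -> 0 limit recovers
    deterministic nearest-code quantization."""
    dists = [sum((zi - ci) ** 2 for zi, ci in zip(z, c)) for c in codebook]
    logits = [-d / temperature for d in dists]
    m = max(logits)                                   # for numerical stability
    weights = [math.exp(l - m) for l in logits]
    idx = random.choices(range(len(codebook)), weights=weights, k=1)[0]
    return idx, codebook[idx]

# Toy 2-D codebook; real models learn these vectors jointly with the encoder.
codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
idx, code = stochastic_quantize([0.9, 1.1], codebook, temperature=0.1)
```

At low temperature the nearest code dominates, so the stochastic rule interpolates between hard quantization and a softer, probabilistic assignment.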


Review for NeurIPS paper: Hierarchical Quantized Autoencoders

Neural Information Processing Systems

Weaknesses: I believe the contributions might be a bit overblown. In particular, in the first bullet point the authors mention an "analysis as to why probabilistic quantized hierarchies are particularly well-suited to optimising the perception-rate tradeoff when performing extreme lossy compression", but there does not seem to be a thorough, conclusive analysis of this matter. Points (2, 3) and (4, 5) are, in my opinion, spread too thin and should be combined. I find the model description somewhat lacking in clarity; everything would be much more understandable with a simple but more structured exposition (with equations) of the model, and possibly also of the VQ-VAE, since this work builds heavily on it.


Review for NeurIPS paper: Hierarchical Quantized Autoencoders

Neural Information Processing Systems

Post-rebuttal, 3 out of 4 reviewers vote for acceptance while R1 deems it marginally below threshold. All 4 reviewers appreciated the convincing results in low-bitrate regimes. The two main points debated in the discussion phase concerned: a) whether there was sufficient novelty in the work w.r.t. VQ-VAE; taking into account the author response and following reviewer discussion, the AC agrees that there are original aspects in the proposed algorithm that make it markedly different, with reasonable justification and ablation. For this, R4 convincingly argued that the paper significantly contributes to the learned compression sub-field.


Hierarchical Quantized Autoencoders

Williams, Will; Ringer, Sam; Ash, Tom; Hughes, John; MacLeod, David; Dougherty, Jamie

arXiv.org Machine Learning

Despite progress in training neural networks for lossy image compression, current approaches fail to maintain both perceptual quality and high-level features at very low bitrates. Encouraged by recent success in learning discrete representations with Vector Quantized Variational AutoEncoders (VQ-VAEs), we motivate the use of a hierarchy of VQ-VAEs to attain high factors of compression. We show that the combination of quantization and hierarchical latent structure aids likelihood-based image compression. This leads us to introduce a more probabilistic framing of the VQ-VAE, of which previous work is a limiting case. Our hierarchy produces a Markovian series of latent variables that reconstruct high-quality images which retain semantically meaningful features. These latents can then be further used to generate realistic samples. We provide qualitative and quantitative evaluations of reconstructions and samples on the CelebA and MNIST datasets.
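The "Markovian series of latent variables" used to generate samples can be pictured as ancestral sampling down a chain, where each level conditions only on the level above. The tables below are hand-written stand-ins for a learned top-level prior p(z_2) and conditional p(z_1 | z_2), purely to show the sampling flow; they are not the paper's learned models.

```python
import random

# Toy two-level hierarchy of discrete latents: z_2 -> z_1 (-> image, omitted).
top_prior = {0: 0.5, 1: 0.5}                 # stand-in for learned p(z_2)
cond = {0: {"a": 0.9, "b": 0.1},             # stand-in for learned p(z_1 | z_2)
        1: {"a": 0.2, "b": 0.8}}

def sample(dist):
    """Draw a key from a {value: probability} table by inverse CDF."""
    r, acc = random.random(), 0.0
    for value, p in dist.items():
        acc += p
        if r < acc:
            return value
    return value  # guard against floating-point round-off

z2 = sample(top_prior)       # sample the coarsest latent first
z1 = sample(cond[z2])        # each lower level conditions only on the one above
```

In the real model each conditional would be a neural decoder over codebook indices, but the Markov structure of the sampling procedure is the same.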