HQ-VAE: Hierarchical Discrete Representation Learning with Variational Bayes

Yuhta Takida, Yukara Ikemiya, Takashi Shibuya, Kazuki Shimada, Woosung Choi, Chieh-Hsin Lai, Naoki Murata, Toshimitsu Uesaka, Kengo Uchida, Wei-Hsiang Liao, Yuki Mitsufuji

arXiv.org Artificial Intelligence 

Vector quantization (VQ) is a technique for deterministically learning features with discrete codebook representations. It is commonly performed with a variational autoencoding model, VQ-VAE, which can be further extended to hierarchical structures for making high-fidelity reconstructions. However, such hierarchical extensions of VQ-VAE often suffer from codebook/layer collapse, in which the codebook is not used efficiently to express the data, degrading reconstruction accuracy. To mitigate this problem, we propose a novel unified framework for stochastically learning hierarchical discrete representations on the basis of variational Bayes, called the hierarchically quantized variational autoencoder (HQ-VAE). HQ-VAE naturally generalizes the hierarchical variants of VQ-VAE, such as VQ-VAE-2 and the residual-quantized VAE (RQ-VAE), and provides them with a Bayesian training scheme. Our comprehensive experiments on image datasets show that HQ-VAE enhances codebook usage and improves reconstruction performance.
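To make the setting concrete, the following is a minimal NumPy sketch (not the authors' implementation) of the deterministic VQ step and of residual quantization, the layered scheme behind RQ-VAE that the abstract refers to. All names, shapes, and sizes (codebook_size, latent_dim, the two-layer hierarchy) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
codebook_size, latent_dim = 8, 4  # K codewords, each of dimension D (illustrative)

def quantize(z, codebook):
    """Deterministic VQ: map each latent vector to its nearest codeword."""
    # Squared distances between latents (N, D) and codewords (K, D) -> (N, K).
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    idx = d.argmin(axis=1)            # discrete code index per latent
    return codebook[idx], idx

def residual_quantize(z, codebooks):
    """RQ-style hierarchy: each layer quantizes the residual left by the previous one."""
    residual, z_q, codes = z, np.zeros_like(z), []
    for cb in codebooks:
        q, idx = quantize(residual, cb)
        z_q = z_q + q                 # accumulated reconstruction of z
        residual = residual - q       # what the next layer must explain
        codes.append(idx)
    return z_q, codes

z = rng.normal(size=(3, latent_dim))  # stand-in for encoder outputs
codebooks = [rng.normal(size=(codebook_size, latent_dim)) for _ in range(2)]
z_q, codes = residual_quantize(z, codebooks)
print(codes)                          # per-layer discrete codes
print(np.abs(z - z_q).mean())         # residual error shrinks as layers are added
```

Codebook/layer collapse in this picture corresponds to argmin concentrating on a few codewords (or a whole layer's contribution vanishing), so most of the discrete capacity goes unused; HQ-VAE addresses this by training such hierarchies stochastically under a variational Bayes objective rather than with the deterministic assignment above.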