Beta-VAE Reproducibility: Challenges and Extensions

Fil, Miroslav; Mesinovic, Munib; Morris, Matthew; Wildberger, Jonas

arXiv.org Artificial Intelligence 

Unsupervised learning is known to be brittle even on toy datasets, and a meaningful, mathematically precise definition of disentanglement remains difficult to find. Here we investigate the original β-VAE paper and add to the existing evidence that its results are difficult to reproduce. We also extend the experiments to more complex datasets. In addition, we implement a Fréchet Inception Distance (FID) scoring metric for the β-VAE model and conduct a qualitative analysis of the results obtained. We end with a brief discussion of possible future investigations that could make these claims more robust.

Variational autoencoders (Kingma & Welling, 2014) are a class of unsupervised representation learning models with a principled probabilistic interpretation that extends the standard autoencoders first described by Hinton & Salakhutdinov (2006). However, unsupervised learning is notoriously brittle even on toy datasets, and a meaningful, mathematically precise definition of disentanglement remains difficult to find. It is thus not obvious to what extent β-VAEs can robustly obtain disentangled representations in different settings.
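
For orientation, the objective usually associated with β-VAEs is the standard VAE loss with its KL term weighted by a factor β. The sketch below is illustrative only and is not taken from the authors' code. It assumes a diagonal Gaussian posterior and a standard normal prior, and the function names and the default beta=4.0 are hypothetical choices made for this example.

```python
import math


def gaussian_kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions.

    Assumption: diagonal Gaussian posterior and standard normal prior.
    """
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var)
    )


def beta_vae_loss(recon_error, mu, log_var, beta=4.0):
    """Per-example loss: reconstruction error plus beta times the KL term.

    recon_error is whatever reconstruction term the model uses (for example a
    negative log-likelihood); beta=4.0 is a hypothetical default, not a value
    reported in the text. Setting beta=1.0 gives the ordinary VAE loss.
    """
    return recon_error + beta * gaussian_kl_to_standard_normal(mu, log_var)


if __name__ == "__main__":
    # A posterior equal to the prior contributes no KL penalty.
    print(beta_vae_loss(1.5, mu=[0.0, 0.0], log_var=[0.0, 0.0]))
    # Shifting the mean away from zero is penalised beta times more strongly.
    print(beta_vae_loss(2.0, mu=[1.0], log_var=[0.0], beta=4.0))
```

Larger values of β penalise posteriors that deviate from the prior more heavily, which is the usual lever for trading reconstruction quality against pressure toward disentangled latents.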