Eccentric Regularization: Minimizing Hyperspherical Energy without explicit projection

Xuefeng Li, Alan Blair

arXiv.org Artificial Intelligence 

In recent years, a number of regularization methods have been introduced that force the latent activations of an autoencoder or deep neural network to conform to either a hyperspherical or a Gaussian distribution, in order to encourage diversity in the latent vectors or to minimize the implicit rank of the distribution in latent space. Variational Autoencoders (VAEs) (Kingma and Welling, 2014) and related variational methods such as β-VAE (Higgins et al., 2017) force the latent distribution to match a known prior by minimizing the Kullback-Leibler divergence. Normally, a standard Gaussian distribution is used as the prior, but alternatives such as the hyperspherical distribution have also been investigated in the literature due to certain advantages (Davidson et al., 2018). More recently, deterministic alternatives have been proposed, such as the Wasserstein AutoEncoder (WAE) (Tolstikhin et al., 2018), VQ-VAE (van den Oord et al., 2017), and RAE (Ghosh et al., 2020). Several existing methods encourage diversity by maximizing pairwise dissimilarity between items, drawing inspiration in part from a 1904 paper by J.J. Thomson, in which various classical models are proposed for maintaining the electrons of an atom in an appropriate formation around the nucleus (Thomson, 1904). Hyperspherical Energy Minimization (Liu et al., 2018) has been used to regularize the hidden unit activations of deep networks.
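For concreteness, here is a minimal PyTorch sketch (not code from the paper; the function names, the Riesz exponent s, and the epsilon constants are illustrative assumptions) of the two regularization ideas the abstract refers to: the closed-form KL term that VAEs minimize against a standard Gaussian prior, and a pairwise hyperspherical energy in the spirit of Liu et al. (2018), which requires explicitly projecting the latent vectors onto the unit sphere before measuring their pairwise repulsion.

```python
import torch

def gaussian_kl(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    # Closed-form KL( N(mu, diag(exp(logvar))) || N(0, I) ), averaged over the batch;
    # this is the standard VAE prior-matching term.
    return -0.5 * torch.mean(torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1))

def hyperspherical_energy(z: torch.Tensor, s: float = 1.0, eps: float = 1e-8) -> torch.Tensor:
    # Riesz s-energy of the batch after explicit projection onto the unit sphere.
    # Lower energy corresponds to latent directions that are more uniformly spread,
    # analogous to electrons repelling each other in Thomson's model.
    u = z / (z.norm(dim=1, keepdim=True) + eps)           # explicit projection onto the sphere
    d = torch.cdist(u, u)                                  # (n, n) pairwise Euclidean distances
    off_diag = ~torch.eye(z.shape[0], dtype=torch.bool, device=z.device)
    return (d[off_diag] + eps).pow(-s).mean()              # energy kernel 1 / ||u_i - u_j||^s

# Hypothetical usage: penalize a batch of latent codes so they spread apart.
z = torch.randn(64, 16, requires_grad=True)    # stand-in for an encoder's output
loss = hyperspherical_energy(z)
loss.backward()                                # gradients push the latents toward uniformity
```

In this baseline formulation the normalization step is what the paper's title contrasts against: the energy is defined only on the sphere, so each latent vector must be explicitly projected before the pairwise energy can be computed.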
