Learning Deep Energy Models: Contrastive Divergence vs. Amortized MLE
We propose several new algorithms for learning deep energy models from data, motivated by the recently proposed Stein variational gradient descent (SVGD) algorithm. These include Stein contrastive divergence (SteinCD), which integrates contrastive divergence (CD) with SVGD based on their theoretical connections, and SteinGAN, which trains an auxiliary generator to produce the negative samples in maximum likelihood estimation (MLE). We demonstrate that SteinCD trains models with good generalization (high test likelihood), while SteinGAN generates realistic-looking images competitive with GAN-style methods. We show that by combining SteinCD and SteinGAN, it is possible to inherit the advantages of both approaches.
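For readers unfamiliar with SVGD, the particle update that both SteinCD and SteinGAN build on can be sketched in a few lines. This is a minimal one-dimensional illustration, not the paper's implementation: the RBF kernel with fixed bandwidth, the Gaussian target, the step size, and the names `svgd_step` and `gaussian_score` are all assumptions made for the example. Each particle moves along a kernel-weighted average of the target's score (an attractive term) plus the kernel gradient (a repulsive term that keeps particles spread out).

```python
import math

def svgd_step(particles, score, step_size=0.1, bandwidth=1.0):
    """One SVGD update for 1-D particles (illustrative sketch).

    phi(x_i) = (1/n) * sum_j [ k(x_j, x_i) * score(x_j) + d/dx_j k(x_j, x_i) ]
    with an RBF kernel k(a, b) = exp(-(a - b)^2 / (2 * bandwidth^2)).
    """
    n = len(particles)
    updated = []
    for x_i in particles:
        phi = 0.0
        for x_j in particles:
            k = math.exp(-(x_j - x_i) ** 2 / (2.0 * bandwidth ** 2))
            grad_k = -(x_j - x_i) / bandwidth ** 2 * k  # derivative w.r.t. x_j
            phi += k * score(x_j) + grad_k
        updated.append(x_i + step_size * phi / n)
    return updated

def gaussian_score(x, mean=2.0, std=1.0):
    """Score (gradient of log density) of an assumed Gaussian target."""
    return -(x - mean) / std ** 2

# Start all particles away from the target mean and iterate.
particles = [-1.0 + 0.1 * i for i in range(20)]
for _ in range(500):
    particles = svgd_step(particles, gaussian_score)
sample_mean = sum(particles) / len(particles)
```

With a single particle the repulsive term vanishes and the update reduces to plain gradient ascent on the log density. In an energy model the score would be the negative gradient of the learned energy, in place of the closed-form Gaussian score assumed here.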
Jul-3-2017