End-to-end Training of Deep Boltzmann Machines by Unbiased Contrastive Divergence with Local Mode Initialization

Taniguchi, Shohei, Suzuki, Masahiro, Iwasawa, Yusuke, Matsuo, Yutaka

arXiv.org Artificial Intelligence 

We address the problem of biased gradient estimation in deep Boltzmann machines (DBMs). The existing method to obtain an unbiased estimator uses a maximal coupling based on a Gibbs sampler, but when the state is high-dimensional, it takes a long time to converge. In this study, we propose to use a coupling based on the Metropolis-Hastings (MH) algorithm and to initialize the state around ...

... Wu, 2017; Ma, 2020), multimodal learning (Srivastava & Salakhutdinov, 2012), and collaborative filtering (Salakhutdinov et al., 2007). Boltzmann machines also have potential as powerful generative models because they are known to be universal approximators of probability mass functions on discrete variables (Le Roux & Bengio, 2008). Among them, deep Boltzmann machines (DBMs) (Salakhutdinov & Larochelle, 2010), which are multi-layered undirected models, can capture complex structures through their deep architecture while retaining the advantages of the Boltzmann machine.
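To make the notion of a maximal coupling mentioned in the abstract concrete, the following is a minimal sketch for a single binary variable. It is an illustration only, not the authors' implementation: the function name `coupled_bernoulli` and the probabilities 0.7 and 0.6 are assumptions chosen for the example. For two Bernoulli distributions, sharing one uniform draw yields a coupling in which the two samples differ with probability |p - q|, which is the total variation distance between the distributions and therefore the smallest mismatch probability any coupling can achieve.

```python
import random


def coupled_bernoulli(p, q, rng):
    """Draw (x, y) with x ~ Bernoulli(p) and y ~ Bernoulli(q) from a maximal coupling.

    Both samples reuse the same uniform draw, so P(x != y) = |p - q|,
    the total variation distance between the two distributions.
    """
    u = rng.random()
    return int(u < p), int(u < q)


# Hypothetical conditional probabilities of one binary unit in two chains.
rng = random.Random(0)
n = 100_000
draws = [coupled_bernoulli(0.7, 0.6, rng) for _ in range(n)]

mismatch_rate = sum(x != y for x, y in draws) / n  # close to |0.7 - 0.6|
mean_x = sum(x for x, _ in draws) / n              # close to 0.7
mean_y = sum(y for _, y in draws) / n              # close to 0.6
```

Each marginal is preserved while the two draws agree as often as possible, which is the property a coupled sampler relies on for its two chains to meet.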
