End-to-end Training of Deep Boltzmann Machines by Unbiased Contrastive Divergence with Local Mode Initialization
Shohei Taniguchi, Masahiro Suzuki, Yusuke Iwasawa, Yutaka Matsuo
arXiv.org Artificial Intelligence
We address the problem of biased gradient estimation in deep Boltzmann machines (DBMs). The existing method for obtaining an unbiased estimator uses a maximal coupling based on a Gibbs sampler, but when the state is high-dimensional, it takes a long time to converge. In this study, we propose to use a coupling based on the Metropolis-Hastings (MH) algorithm and to initialize the state around ...

... Wu, 2017; Ma, 2020), multimodal learning (Srivastava & Salakhutdinov, 2012), and collaborative filtering (Salakhutdinov et al., 2007). Boltzmann machines also have potential as powerful generative models because they are known to be universal approximators of probability mass functions on discrete variables (Le Roux & Bengio, 2008). Among them, deep Boltzmann machines (DBMs) (Salakhutdinov & Larochelle, 2010), which are multi-layered undirected models, can capture complex structures through their depth while retaining the advantages of the Boltzmann machine.
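To make the notion of a maximal coupling mentioned in the abstract concrete, the following is a minimal sketch of the generic rejection-based construction for two small categorical distributions. It is not the paper's Gibbs-based or MH-based coupling for DBMs. The function names (`sample_categorical`, `maximal_coupling`) and the example distributions `p` and `q` are hypothetical and chosen only for illustration.

```python
import random


def sample_categorical(probs, rng):
    """Draw an index i with probability probs[i]."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def maximal_coupling(p, q, rng):
    """Return (x, y) with x ~ p and y ~ q, maximizing P(x == y).

    Generic rejection-based construction (an illustrative assumption,
    not the coupling used in the paper).
    """
    x = sample_categorical(p, rng)
    # Accept y = x with probability min(1, q[x] / p[x]).
    if rng.uniform(0.0, p[x]) <= q[x]:
        return x, x
    # Otherwise draw y from the part of q that lies above p.
    while True:
        y = sample_categorical(q, rng)
        if rng.uniform(0.0, q[y]) > p[y]:
            return x, y


rng = random.Random(0)
p = [0.5, 0.3, 0.2]  # hypothetical distribution of the first chain
q = [0.2, 0.3, 0.5]  # hypothetical distribution of the second chain

pairs = [maximal_coupling(p, q, rng) for _ in range(20000)]
meet_rate = sum(x == y for x, y in pairs) / len(pairs)

# Theoretical meeting probability: total overlap of p and q,
# i.e. one minus their total variation distance.
overlap = sum(min(a, b) for a, b in zip(p, q))
```

Under these assumptions, `meet_rate` should be close to `overlap`, while each coordinate of the pair keeps its own marginal distribution. Coupled chains built from such a step are what allow the two states to meet, and slow meeting in high dimensions is the difficulty the abstract attributes to the Gibbs-based coupling.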
May-31-2023
- Country:
- Asia > Japan
- Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- North America > United States
- Hawaii > Honolulu County
- Honolulu (0.04)
- Pennsylvania > Allegheny County
- Pittsburgh (0.04)
- Genre:
- Research Report > New Finding (0.48)
- Technology: