Reviews: Wasserstein Training of Restricted Boltzmann Machines

Neural Information Processing Systems 

The paper cites [2] as proving statistical consistency of the Wasserstein estimator. I am therefore surprised to see, in Section 4.3, that non-zero shrinkage is obtained (including at gamma = 0) even in the very simple case of modelling an N(0, I) distribution with N(0, sigma^2 I). What is going on here? An actual failure of consistency would be a serious flaw in the formulation of a statistical learning criterion, so the authors should reconcile the two results explicitly; presumably the shrinkage is a finite-sample bias of the sample-based Wasserstein objective rather than an asymptotic inconsistency, but the paper should say so (see the sketch below, which reproduces the effect in a minimal setting).

Also, in Section 3 (Stability and KL regularization), the authors state that, at least for learning based on samples (\hat{p}_theta), some regularization with respect to the KL divergence is required. This clearly weakens the "purity" of the smoothed Wasserstein objective function: the distance cannot stand alone as a training criterion.
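To make the shrinkage concern concrete, here is a minimal sketch of my own (1-D, exact W2 computed via quantile functions, rather than the paper's d-dimensional smoothed setting; all names and parameters are illustrative). The sigma minimizing the expected Wasserstein distance between a finite sample from N(0, 1) and the model N(0, sigma^2) comes out well below 1:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, trials = 10, 1000           # small sample size makes the bias visible

# 1-D identity: W2^2(p, q) = int_0^1 (F_p^{-1}(u) - F_q^{-1}(u))^2 du,
# approximated on a fine grid of quantile levels u.
u = (np.arange(2001) + 0.5) / 2001
z = norm.ppf(u)                # quantile function of the data N(0, 1)

sigmas = np.linspace(0.5, 1.2, 71)        # candidate model scales
w2sq = np.zeros_like(sigmas)
for _ in range(trials):
    x = np.sort(rng.standard_normal(n))   # one data sample from N(0, 1)
    xq = x[np.minimum((u * n).astype(int), n - 1)]  # empirical quantile fn
    # model quantile function is sigma * z; accumulate W2^2 per candidate
    w2sq += np.mean((xq[None, :] - sigmas[:, None] * z[None, :]) ** 2, axis=1)

best = sigmas[np.argmin(w2sq)]
print(f"sigma minimizing expected empirical W2^2: {best:.2f}")  # ~0.87, not 1
```

Note this uses the exact 1-D Wasserstein distance with no entropic smoothing at all, which is why it bears directly on the gamma = 0 case: the shrinkage appears to come from the finite sample, not from the smoothing.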