Goto

Collaborating Authors

 contrastive divergence




Energy-Based Modelling for Discrete and Mixed Data via Heat Equations on Structured Spaces

Neural Information Processing Systems

However, training EBMs on data in discrete or mixed state spaces poses significant challenges due to the lack of robust and fast sampling methods. In this work, we propose to train discrete EBMs with Energy Discrepancy, a loss function which only requires the evaluation of the energy function at data points and their perturbed counterparts, thus eliminating the need for Markov chain Monte Carlo.


Energy-Based Modelling for Discrete and Mixed Data via Heat Equations on Structured Spaces

Neural Information Processing Systems

However, training EBMs on data in discrete or mixed state spaces poses significant challenges due to the lack of robust and fast sampling methods. In this work, we propose to train discrete EBMs with Energy Discrepancy, a loss function which only requires the evaluation of the energy function at data points and their perturbed counterparts, thus eliminating the need for Markov chain Monte Carlo.




Local Learning Rules for Out-of-Equilibrium Physical Generative Models

arXiv.org Artificial Intelligence

AMOLF, Science Park 104, 1098 XG Amsterdam, The Netherlands (Dated: August 28, 2025) We show that the out-of-equilibrium driving protocol of score-based generative models (SGMs) can be learned via local learning rules. The gradient with respect to the parameters of the driving protocol is computed directly from force measurements or from observed system dynamics. As a demonstration, we implement an SGM in a network of driven, nonlinear, overdamped oscillators coupled to a thermal bath. We first apply it to the problem of sampling from a mixture of two Gaussians in 2D. Finally, we train a 12 12 oscillator network on the MNIST dataset to generate images of handwritten digits "0" and "1".


Training Restricted Boltzmann Machine via the Thouless-Anderson-Palmer free energy

Neural Information Processing Systems

Restricted Boltzmann machines are undirected neural networks which have been shown tobe effective in many applications, including serving as initializations fortraining deep multi-layer neural networks. One of the main reasons for their success is theexistence of efficient and practical stochastic algorithms, such as contrastive divergence,for unsupervised training. We propose an alternative deterministic iterative procedure based on an improved mean field method from statistical physics known as the Thouless-Anderson-Palmer approach. We demonstrate that our algorithm provides performance equal to, and sometimes superior to, persistent contrastive divergence, while also providing a clear and easy to evaluate objective function. We believe that this strategycan be easily generalized to other models as well as to more accurate higher-order approximations, paving the way for systematic improvements in training Boltzmann machineswith hidden units.


Jarzynski Reweighting and Sampling Dynamics for Training Energy-Based Models: Theoretical Analysis of Different Transition Kernels

arXiv.org Artificial Intelligence

Energy-Based Models (EBMs) provide a flexible framework for generative modeling, but their training remains theoretically challenging due to the need to approximate normalization constants and efficiently sample from complex, multi-modal distributions. Traditional methods, such as contrastive divergence and score matching, introduce biases that can hinder accurate learning. In this work, we present a theoretical analysis of Jarzynski reweighting, a technique from non-equilibrium statistical mechanics, and its implications for training EBMs. We focus on the role of the choice of the kernel and we illustrate these theoretical considerations in two key generative frameworks: (i) flow-based diffusion models, where we reinterpret Jarzynski reweighting in the context of stochastic interpolants to mitigate discretization errors and improve sample quality, and (ii) Restricted Boltzmann Machines, where we analyze its role in correcting the biases of contrastive divergence. Our results provide insights into the interplay between kernel choice and model performance, highlighting the potential of Jarzynski reweighting as a principled tool for generative learning.


Energy-Based Modelling for Discrete and Mixed Data via Heat Equations on Structured Spaces

arXiv.org Machine Learning

However, training EBMs on data in discrete or mixed state spaces poses significant challenges due to the lack of robust and fast sampling methods. In this work, we propose to train discrete EBMs with Energy Discrepancy, a loss function which only requires the evaluation of the energy function at data points and their perturbed counterparts, thus eliminating the need for Markov chain Monte Carlo. We introduce perturbations of the data distribution by simulating a diffusion process on the discrete state space endowed with a graph structure. This allows us to inform the choice of perturbation from the structure of the modelled discrete variable, while the continuous time parameter enables fine-grained control of the perturbation. Empirically, we demonstrate the efficacy of the proposed approaches in a wide range of applications, including the estimation of discrete densities with non-binary vocabulary and binary image modelling. Finally, we train EBMs on tabular data sets with applications in synthetic data generation and calibrated classification.