A note on the unique properties of the Kullback-Leibler divergence for sampling via gradient flows

Jul-8-2025–arXiv.org Artificial Intelligence

Sampling from a target probability distribution whose density is known only up to a normalisation constant is a fundamental task in computational statistics and machine learning. A natural way to formulate this task is as the optimisation of a functional measuring the dissimilarity to the target distribution. From this point of view, one can derive many popular sampling frameworks, including variational inference [Blei et al., 2017], algorithms based on diffusions [Roberts and Tweedie, 1996, Durmus et al., 2019] and deterministic flows [Liu, 2017], and algorithms based on importance sampling [Chopin et al., 2024, Crucinio and Pathiraja, 2025]. The connection between minimisation of a divergence and Monte Carlo algorithms is established through gradient flows over the space of probability measures (see, e.g., Chewi et al. [2025] and Carrillo et al. [2024] for recent reviews): different metrics over this space lead to different differential equations, whose discretisations correspond to many popular Monte Carlo algorithms. The most widely used divergence is the reverse Kullback-Leibler (KL) divergence, whose gradient flow w.r.t. the Wasserstein-2 metric can be implemented by a Langevin diffusion [Jordan et al., 1998] and easily discretised in time, resulting in the unadjusted Langevin algorithm [Roberts and Tweedie, 1996].
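To make the last step concrete, the following is a minimal sketch of the unadjusted Langevin algorithm, written as the Euler-Maruyama discretisation of the Langevin diffusion dX_t = ∇log π(X_t) dt + √2 dB_t. It is an illustration under stated assumptions, not code from the paper: the function names `ula` and `grad_log_gaussian`, the standard Gaussian target, the step size and the number of chains are all hypothetical choices made for the example. Only the gradient of the log-density is needed, so the normalisation constant of the target never appears.

```python
import numpy as np


def ula(grad_log_target, x0, step, n_steps, rng):
    """Unadjusted Langevin algorithm (illustrative sketch).

    Iterates x <- x + step * grad log pi(x) + sqrt(2 * step) * N(0, I),
    i.e. an Euler-Maruyama discretisation of the Langevin diffusion.
    """
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x + step * grad_log_target(x) + np.sqrt(2.0 * step) * noise
    return x


def grad_log_gaussian(x):
    # Assumed example target: standard Gaussian, log pi(x) = -x^2 / 2 + const,
    # so the gradient is -x and the constant plays no role.
    return -x


rng = np.random.default_rng(0)
# 5000 independent one-dimensional chains, started far from stationarity.
x0 = rng.uniform(-5.0, 5.0, size=(5000, 1))
samples = ula(grad_log_gaussian, x0, step=0.05, n_steps=500, rng=rng)

print("empirical mean:", samples.mean())
print("empirical variance:", samples.var())
```

Because no accept/reject correction is applied, the chain targets a slightly biased distribution: for this Gaussian example the stationary variance is 1 / (1 - step / 2) rather than 1, and the bias shrinks as the step size decreases.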
