AITopics | corollary 3

This paper studies sampling error bounds for denoising diffusion probabilistic models (DDPMs) in the 2-Wasserstein distance. Our contributions are threefold. (i) Under general Lipschitz-type conditions on the score function and for a broad class of variance schedules, including the cosine schedule, we establish sharp upper bounds that are optimal in both the dimension and the number of steps, and recover several sharp error bounds previously obtained in the literature. (ii) We prove that the same Lipschitz-type conditions, which encompass those commonly imposed on the (learned) score, imply a logarithmic Sobolev inequality and hence a quadratic transportation cost inequality for the DDPM. As a consequence, in settings covered by existing work, an optimal Wasserstein bound, up to a logarithmic factor, follows from the recently obtained sharp error bound in the Kullback-Leibler divergence under geometric-type variance schedules. (iii) We show that for general log-concave target distributions, the optimal Wasserstein error bound remains attainable even without a quadratic transportation cost inequality for the target. Our analysis is based on viewing the DDPM sampler as a discretization of the Föllmer process rather than the conventional reverse Ornstein-Uhlenbeck process.
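The step in (ii) from a Kullback-Leibler bound to a Wasserstein bound can be illustrated with the standard Otto-Villani implication. The block below is a sketch in assumed notation, not the paper's exact statement: $\mu$ stands for a reference law satisfying a logarithmic Sobolev inequality with constant $C$, $\nu$ for any law absolutely continuous with respect to $\mu$, and $\varepsilon$ for a hypothetical KL error level. The normalization of $C$ is also an assumption.

```latex
% Logarithmic Sobolev inequality LSI(C) for \mu (assumed normalization):
\operatorname{Ent}_{\mu}(f^{2}) \;\le\; 2C \int |\nabla f|^{2} \, d\mu
\qquad \text{for all smooth } f .

% Otto-Villani: LSI(C) implies the quadratic transportation cost inequality T_2(C):
W_{2}^{2}(\nu, \mu) \;\le\; 2C \, \mathrm{KL}(\nu \,\|\, \mu) .

% Consequence: a KL error bound transfers to a 2-Wasserstein error bound,
\mathrm{KL}(\nu \,\|\, \mu) \le \varepsilon
\quad \Longrightarrow \quad
W_{2}(\nu, \mu) \le \sqrt{2C\,\varepsilon} .
```

Under this reading, a KL bound of order $\varepsilon$ yields a Wasserstein bound of order $\sqrt{\varepsilon}$, with $C$ entering through its square root.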

artificial intelligence, lemma 4, machine learning, (17 more...)

arXiv.org Machine Learning

2605.18069

Country:

Asia (0.46)
Europe > France (0.28)

Genre:

Research Report (1.00)
Overview (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.60)


Variational predictive resampling

Laura Battaglia, Stefano Cortinovis, Chris Holmes, David T. Frazier, Jack Jewson

arXiv.org Machine Learning | May-14-2026

Bayesian inference provides principled uncertainty quantification, but accurate posterior sampling with MCMC can be computationally prohibitive for modern applications. Variational inference (VI) offers a scalable alternative and often yields accurate predictive distributions, but cheap variational families such as mean-field (MF) can produce over-concentrated approximations that miss posterior dependence. We propose variational predictive resampling (VPR), a scalable posterior sampling method that exploits VI's predictive strength within a predictive-resampling framework to better approximate the Bayesian posterior. Given a prior-likelihood pair, VPR repeatedly imputes future observations from the current variational predictive, updates the variational approximation after each imputation, and records the parameter value implied by the completed sample. We establish conditions under which the law of the parameter returned by VPR is well defined and show that its finite-horizon approximation converges to this limit. In a tractable Gaussian location model, we show that VPR with MF variational predictives converges to the exact Bayesian posterior, whereas the optimal MF-VI approximation retains a non-vanishing asymptotic gap. Experiments on linear regression, logistic regression, and hierarchical linear mixed-effects models demonstrate that VPR substantially improves posterior uncertainty quantification and recovers posterior dependence missed by MF-VI, while remaining computationally competitive with, and often more efficient than, MCMC.
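The impute, update, record loop described above can be sketched on a toy problem. This is a minimal illustration under strong assumptions, not the authors' implementation: it uses a one-dimensional Gaussian location model with known noise variance, where the Gaussian approximating family contains the exact posterior, so the "variational update" reduces to the conjugate update. The function name `predictive_resampling_draws` and all parameter names and defaults are hypothetical.

```python
import math
import random
import statistics


def predictive_resampling_draws(y, n_draws=500, horizon=500,
                                prior_mean=0.0, prior_var=10.0,
                                noise_var=1.0, seed=0):
    """Toy predictive resampling for y_i ~ N(theta, noise_var).

    Assumption: the Gaussian approximation is exact here, so each
    update step is the closed-form conjugate update.
    """
    rng = random.Random(seed)
    # Gaussian posterior given the observed data (precision form).
    prec0 = 1.0 / prior_var + len(y) / noise_var
    mean0 = (prior_mean / prior_var + sum(y) / noise_var) / prec0

    draws = []
    for _draw in range(n_draws):
        prec, mean = prec0, mean0
        for _step in range(horizon):
            # Impute one future observation from the current predictive.
            y_new = rng.gauss(mean, math.sqrt(1.0 / prec + noise_var))
            # Update the approximation with the imputed observation.
            new_prec = prec + 1.0 / noise_var
            mean = (prec * mean + y_new / noise_var) / new_prec
            prec = new_prec
        # Record the parameter value implied by the completed sample.
        draws.append(mean)
    return draws, mean0, 1.0 / prec0


data_rng = random.Random(1)
y = [data_rng.gauss(0.5, 1.0) for _ in range(20)]
draws, post_mean, post_var = predictive_resampling_draws(y)
print(statistics.fmean(draws), post_mean)      # sample mean vs. exact posterior mean
print(statistics.pvariance(draws), post_var)   # sample variance vs. exact posterior variance
```

In this conjugate toy case the recorded values have mean equal to the posterior mean and variance slightly below the posterior variance, with the gap shrinking as `horizon` grows. The paper's setting replaces the exact update with a variational one, which this sketch does not attempt.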

artificial intelligence, machine learning, posterior, (18 more...)

arXiv.org Machine Learning

2605.11168

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)


460b491b917d4185ed1f5be97229721a-Paper.pdf

Neural Information Processing Systems | Apr-25-2026, 16:23:25 GMT

artificial intelligence, machine learning, outlier, (15 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)


8 max

Neural Information Processing Systems | Apr-25-2026, 10:12:16 GMT

We proceed to show the sparsistency of the estimated parameters. First, suppose that $\Theta_{t;ij} \neq 0$ for some time $t$ and index $(i,j)$. Due to $0 < \gamma < 1$, the above inequality implies that $\hat{\Theta}_{t;ij} = 0$ for every $t$ and $(i,j) \notin S_t$, and $\hat{\Theta}_{t;ij} - \hat{\Theta}_{t-1;ij} = 0$ for every $t > 0$ and $(i,j) \notin D_t$. The proof is inspired by Corollary 1 in [47]. First, we present the following key lemmas.

artificial intelligence, precision matrix, runtime, (17 more...)

Neural Information Processing Systems

Industry: Banking & Finance > Trading (0.47)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.47)


214cfbe603b7f9f9bc005d5f53f7a1d3-Paper.pdf

Neural Information Processing Systems | Apr-25-2026, 01:54:35 GMT

In this paper, we investigate the question: Given a small number of datapoints, for example N = 30, how tight can PAC-Bayes and test set bounds be made? For such small datasets, test set bounds adversely affect generalisation performance by withholding data from the training procedure. In this setting, PAC-Bayes bounds are especially attractive, due to their ability to use all the data to simultaneously learn a posterior and bound its generalisation risk. We focus on the case of i.i.d.


02a92b52670752daf17b53f04f1ab405-Supplemental-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 05:59:35 GMT

artificial intelligence, machine learning, spanner, (18 more...)

Neural Information Processing Systems

Genre: Research Report (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.72)
Information Technology > Artificial Intelligence > Machine Learning (0.68)


02a92b52670752daf17b53f04f1ab405-Paper-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 05:59:31 GMT

artificial intelligence, machine learning, spanner, (19 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Genre: Research Report (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.98)


Optimal Learning for Multi-pass Stochastic Gradient Methods

Junhong Lin, Lorenzo Rosasco

Neural Information Processing Systems | Apr-22-2026, 14:34:43 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, sgm, (14 more...)

Neural Information Processing Systems

Country: Europe (0.46)

Industry: Education (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.52)


Truncated Variance Reduction: A Unified Approach to Bayesian Optimization and Level-Set Estimation

Ilija Bogunovic, Jonathan Scarlett, Andreas Krause, Volkan Cevher

Neural Information Processing Systems | Apr-22-2026, 05:29:02 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


Universality of Gaussian-Mixture Reverse Kernels in Conditional Diffusion

Nafiz Ishtiaque, Syed Arefinul Haque, Kazi Ashraful Alam, Fatima Jahara

arXiv.org Machine Learning | Apr-16-2026

We prove that conditional diffusion models whose reverse kernels are finite Gaussian mixtures with ReLU-network logits can approximate suitably regular target distributions arbitrarily well in context-averaged conditional KL divergence, up to an irreducible terminal mismatch that typically vanishes with increasing diffusion horizon. A path-space decomposition reduces the output error to this mismatch plus per-step reverse-kernel errors; assuming each reverse kernel factors through a finite-dimensional feature map, each step becomes a static conditional density approximation problem, solved by composing Norets' Gaussian-mixture theory with quantitative ReLU bounds. Under exact terminal matching the resulting neural reverse-kernel class is dense in conditional KL.
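The path-space decomposition mentioned above is plausibly an instance of the standard chain rule for KL divergence between Markov chains. The block below is a sketch in assumed notation and may differ from the paper's statement: $P$ is the joint law of the forward chain $(X_0, \dots, X_T)$ with true reverse kernels $p_t(\cdot \mid x_t)$, $Q$ is the model's joint law with terminal law $Q_T$ and learned reverse kernels $q_t(\cdot \mid x_t)$, and every term is understood conditionally on a context $c$ and then averaged over $c$.

```latex
% Data processing: the output marginal is no further apart than the full paths.
\mathrm{KL}(P_{0} \,\|\, Q_{0}) \;\le\; \mathrm{KL}(P_{0:T} \,\|\, Q_{0:T})

% Chain rule along the reverse-time factorization of both joint laws:
\mathrm{KL}(P_{0:T} \,\|\, Q_{0:T})
  \;=\; \underbrace{\mathrm{KL}(P_{T} \,\|\, Q_{T})}_{\text{terminal mismatch}}
  \;+\; \sum_{t=1}^{T}
        \mathbb{E}_{x_{t} \sim P_{t}}
        \Big[ \mathrm{KL}\big( p_{t}(\cdot \mid x_{t}) \,\big\|\, q_{t}(\cdot \mid x_{t}) \big) \Big] .
```

In this form, the first term corresponds to the terminal mismatch and each summand to a per-step reverse-kernel error, which is where a Gaussian-mixture approximation of each kernel would enter.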

artificial intelligence, assumption 3, machine learning, (16 more...)

arXiv.org Machine Learning

2604.1347

Country:

Europe > Austria > Vienna (0.14)
Asia > Middle East > Jordan (0.04)
Asia > China > Shanghai > Shanghai (0.04)
(9 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)


Wasserstein bounds for denoising diffusion probabilistic models via the Föllmer process