AITopics | sinkhorn distance

Rethinking Losses for Diffusion Bridge Samplers

Neural Information Processing Systems, Jun-20-2026, 14:20:17 GMT

Diffusion bridges are a promising class of deep-learning methods for sampling from unnormalized distributions. Recent works show that the Log Variance (LV) loss consistently outperforms the reverse Kullback-Leibler (rKL) loss when using the reparametrization trick to compute rKL-gradients. While the on-policy LV loss yields identical gradients to the rKL loss when combined with the log-derivative trick for diffusion samplers with non-learnable forward processes, this equivalence does not hold for diffusion bridges or when diffusion coefficients are learned. Based on this insight, we argue that for diffusion bridges the LV loss does not represent an optimization objective that can be motivated like the rKL loss via the data processing inequality. Our analysis shows that employing the rKL loss with the log-derivative trick (rKL-LD) not only avoids these conceptual problems but also consistently outperforms the LV loss. Experimental results with different types of diffusion bridges on challenging benchmarks show that samplers trained with the rKL-LD loss achieve better performance. From a practical perspective, we find that rKL-LD requires significantly less hyperparameter optimization and yields more stable training behavior.
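
To make the two objectives named in the abstract concrete, here is a minimal sketch that estimates them from samples. It assumes we already have, for each sample drawn from the sampler q, the value of log q and of the unnormalized target log-density; the function names `rkl_estimate` and `lv_estimate` are hypothetical and not taken from the paper. It illustrates only the loss values, not the gradient estimators (reparametrization vs. log-derivative trick) that the paper compares.

```python
from statistics import fmean, pvariance


def log_ratios(log_q, log_p_tilde):
    """Per-sample log q(x) - log p~(x), where p~ is the unnormalized target."""
    return [lq - lp for lq, lp in zip(log_q, log_p_tilde)]


def rkl_estimate(log_q, log_p_tilde):
    """Monte Carlo estimate of E_q[log q - log p~].

    This equals the reverse KL divergence up to the unknown
    log-normalizer of the target, which is an additive constant.
    """
    return fmean(log_ratios(log_q, log_p_tilde))


def lv_estimate(log_q, log_p_tilde):
    """Log Variance loss: variance of the log-ratio over the samples.

    Any additive constant (e.g. the log-normalizer) cancels out.
    """
    return pvariance(log_ratios(log_q, log_p_tilde))


# Illustrative values only (not from any experiment).
log_q = [-1.0, -2.0, -0.5]
log_p_tilde = [-1.5, -1.0, -0.5]
print(rkl_estimate(log_q, log_p_tilde), lv_estimate(log_q, log_p_tilde))
```

Shifting every `log_p_tilde` value by a constant moves the rKL estimate by that constant but leaves the LV estimate unchanged, which is why neither objective needs the normalizing constant for optimization.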

artificial intelligence, bayesian inference, machine learning, (18 more...)

Country: Europe > Austria (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology (0.48)

Non-equilibrium Annealed Adjoint Sampler

Neural Information Processing Systems, Jun-19-2026, 04:44:37 GMT

Recently, there has been significant progress in learning-based diffusion samplers, which aim to sample from a given unnormalized density. Many of these approaches formulate the sampling task as a stochastic optimal control (SOC) problem using a canonical uninformative reference process, which limits their ability to efficiently guide trajectories toward the target distribution. In this work, we propose the Non-equilibrium Annealed Adjoint Sampler (NAAS), a novel SOC-based diffusion framework that employs annealed reference dynamics as a non-stationary base SDE. This annealing structure provides a natural progression toward the target distribution and generates informative reference trajectories, thereby enhancing the stability and efficiency of learning the control. Owing to our SOC formulation, our framework can incorporate a variety of SOC solvers, thereby offering high flexibility in algorithmic design. As one instantiation, we employ a lean adjoint system inspired by adjoint matching, enabling efficient and scalable training. We demonstrate the effectiveness of NAAS across a range of tasks, including sampling from classical energy landscapes and molecular Boltzmann distributions.

artificial intelligence, machine learning, optimization problem, (19 more...)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Modality-Agnostic Topology Aware Localization - Supplemental Material - Farhad G. Zanjani, Ilia Karmanov, Hanno Ackermann, Daniel Dijkman, Simone Merlin, Max Welling, Fatih Porikli (Qualcomm AI Research)

Neural Information Processing Systems, Apr-26-2026, 00:07:35 GMT

Triplet sampling was implemented based on the temporal vicinity of samples. Since the input is sequential, for each sample (called anchor) in the sequence, we consider a small and a large temporal window with predefined fixed widths. These two temporal windows are centered at the timestamp of the anchor. Any sample inside the smaller temporal window can be considered as a positive sample and any sample outside the small window but inside the large window can be considered as a negative sample. The widths of the temporal windows roughly depend on the speed of the observer in the environment.
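
The windowing rule described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `sample_triplets`, its parameters, and the choice of drawing one positive and one negative uniformly at random per anchor are assumptions. Each window is treated as centered on the anchor, so a window of width w covers timestamps within w / 2 of the anchor.

```python
import random


def sample_triplets(timestamps, small_width, large_width, seed=0):
    """Return (anchor, positive, negative) index triplets.

    positive: another sample inside the small window around the anchor.
    negative: a sample outside the small window but inside the large one.
    Anchors lacking either candidate set are skipped.
    """
    rng = random.Random(seed)
    triplets = []
    for a, t_a in enumerate(timestamps):
        positives, negatives = [], []
        for j, t_j in enumerate(timestamps):
            if j == a:
                continue
            dt = abs(t_j - t_a)
            if dt <= small_width / 2:
                positives.append(j)
            elif dt <= large_width / 2:
                negatives.append(j)
        if positives and negatives:
            triplets.append((a, rng.choice(positives), rng.choice(negatives)))
    return triplets


# Ten samples recorded one time unit apart (illustrative values).
timestamps = [float(t) for t in range(10)]
for anchor, pos, neg in sample_triplets(timestamps, small_width=2, large_width=6)[:3]:
    print(anchor, pos, neg)
```

In line with the text, the two widths would be tuned to the observer's speed: a faster observer covers more ground per time step, which argues for narrower windows.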

artificial intelligence, experiment, machine learning, (12 more...)

Industry:

Telecommunications (0.42)
Semiconductors & Electronics (0.42)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)

a70ee7ea485e4fd36abbfc4adf591c28-Paper-Conference.pdf

Neural Information Processing Systems, Feb-17-2026, 06:03:30 GMT

artificial intelligence, machine learning, natural language, (14 more...)

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
Europe > Latvia > Riga Municipality > Riga (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

f55cadb97eaff2ba1980e001b0bd9842-Paper.pdf

Neural Information Processing Systems, Feb-15-2026, 03:24:04 GMT

algorithm, approximation, sinkhorn, (15 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Intra-Modal Proxy Learning for Zero-Shot Visual Categorization with CLIP

Neural Information Processing Systems, Feb-11-2026, 18:07:01 GMT

Vision-language pre-training methods, e.g., CLIP, demonstrate an impressive zero-shot performance on visual categorizations with the class proxy from the text embedding of the class name.

large language model, machine learning, proxy, (19 more...)

Country:

North America > United States > Washington > Pierce County > Tacoma (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

569ff987c643b4bedf504efda8f786c2-Supplemental.pdf

Neural Information Processing Systems, Feb-8-2026, 18:28:17 GMT

experiment, prediction, prototype vector, (16 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)

COT-GAN: Generating Sequential Data via Causal Optimal Transport

Neural Information Processing Systems, Feb-8-2026, 16:25:50 GMT

Remarkably, we find that this causality condition provides a natural framework to parameterize the cost function that is learned by the discriminator as a robust (worst-case) distance, and an ideal mechanism for learning time-dependent data distributions.

artificial intelligence, machine learning, neurips, (17 more...)

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Austria > Vienna (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.68)

Online Sinkhorn: Optimal Transport distances from sample streams

Neural Information Processing Systems, Feb-7-2026, 12:56:38 GMT

This paper introduces a new online estimator of entropy-regularized OT distances between two such arbitrary distributions.

artificial intelligence, machine learning, sinkhorn, (16 more...)

Country:

Europe > France > Île-de-France > Paris > Paris (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Massively scalable Sinkhorn distances via the Nyström method

Neural Information Processing Systems, Dec-26-2025, 03:46:32 GMT

The Sinkhorn distance, a variant of the Wasserstein distance with entropic regularization, is an increasingly popular tool in machine learning and statistical inference. However, the time and memory requirements of standard algorithms for computing this distance grow quadratically with the size of the data, rendering them prohibitively expensive on massive data sets. In this work, we show that this challenge is surprisingly easy to circumvent: combining two simple techniques--the Nyström method and Sinkhorn scaling--provably yields an accurate approximation of the Sinkhorn distance with significantly lower time and memory requirements than other approaches. We prove our results via new, explicit analyses of the Nyström method and of the stability properties of Sinkhorn scaling. We validate our claims experimentally by showing that our approach easily computes Sinkhorn distances on data sets hundreds of times larger than can be handled by other techniques.
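
For orientation, the standard Sinkhorn scaling that the abstract builds on can be sketched in a few lines. This is the plain dense version, whose time and memory grow quadratically with the number of points; it does not include the Nyström approximation the paper proposes. The function name `sinkhorn_distance`, the regularization value `eps`, and the fixed iteration count are illustrative assumptions.

```python
import math


def sinkhorn_distance(a, b, cost, eps=0.1, n_iters=200):
    """Entropy-regularized OT between histograms a and b.

    cost[i][j] is the ground cost of moving mass from bin i to bin j.
    Returns (transport cost of the regularized plan, the plan itself).
    """
    n, m = len(a), len(b)
    # Gibbs kernel: this dense n x m matrix is the quadratic bottleneck.
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    plan = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    dist = sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return dist, plan


# Two-bin example: identical histograms, unit cost off the diagonal.
a = [0.5, 0.5]
b = [0.5, 0.5]
cost = [[0.0, 1.0], [1.0, 0.0]]
dist, plan = sinkhorn_distance(a, b, cost)
print(dist)
```

Every iteration touches all n x m kernel entries, which is the cost that a low-rank approximation of K, such as the Nyström method named in the abstract, is meant to reduce.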

massively scalable sinkhorn distance, name change, sinkhorn distance, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.41)
