Collaborating Authors

Marco Cuturi



Differentiable Ranking and Sorting using Optimal Transport

Neural Information Processing Systems

Sorting is used pervasively in machine learning, either to define elementary algorithms, such as k-nearest neighbors (k-NN) rules, or to define test-time metrics, such as top-k classification accuracy or ranking losses. Sorting is however a poor match for the end-to-end, automatically differentiable pipelines of deep learning. Indeed, sorting procedures output two vectors, neither of which is differentiable: the vector of sorted values is piecewise linear, while the sorting permutation itself (or its inverse, the vector of ranks) has no differentiable properties to speak of, since it is integer-valued. We propose in this paper to replace the usual sort procedure with a differentiable proxy. Our proxy builds upon the fact that sorting can be seen as an optimal assignment problem, one in which the n values to be sorted are matched to an auxiliary probability measure supported on any increasing family of n target values. From this observation, we propose extended rank and sort operators by considering optimal transport (OT) problems (the natural relaxation for assignments) where the auxiliary measure can be any weighted measure supported on m increasing values, where m ≠ n. We recover differentiable operators by regularizing these OT problems with an entropic penalty, and solve them by applying Sinkhorn iterations. Using these smoothed rank and sort operators, we propose differentiable proxies for the classification 0/1 loss as well as for the quantile regression loss.
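
As a concrete illustration of the recipe sketched in this abstract, here is a minimal NumPy sketch: the n input values are matched, through an entropy-regularized transport plan obtained with Sinkhorn iterations, to m increasing target values, and smoothed ranks and sorted values are then read off the plan. The target grid, the squared-distance cost, and the function names below are illustrative assumptions, not the paper's exact operators.

    import numpy as np

    def sinkhorn_plan(a, b, C, eps, n_iters=500):
        # Entropic OT plan between histograms a (n,) and b (m,) for cost C (n, m).
        K = np.exp(-C / eps)
        u, v = np.ones_like(a), np.ones_like(b)
        for _ in range(n_iters):
            u = a / (K @ v)
            v = b / (K.T @ u)
        return u[:, None] * K * v[None, :]

    def soft_rank_and_sort(x, m=None, eps=0.01):
        # Match x to m increasing targets and read smoothed outputs off the plan.
        n = x.size
        m = n if m is None else m
        y = np.linspace(x.min(), x.max(), m)      # any increasing family of targets
        a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
        P = sinkhorn_plan(a, b, (x[:, None] - y[None, :]) ** 2, eps)
        soft_ranks = n * P @ np.arange(1, m + 1)  # expected target position per input
        soft_sorted = m * P.T @ x                 # barycentric projection onto targets
        return soft_ranks, soft_sorted

    ranks, values = soft_rank_and_sort(np.array([0.8, 0.1, 0.5, 0.3]))
    # ranks ~ [4, 1, 3, 2], values ~ [0.1, 0.3, 0.5, 0.8], both smooth in the inputs.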


Tree-Sliced Variants of Wasserstein Distances

Neural Information Processing Systems

Optimal transport (OT) theory defines a powerful set of tools to compare probability distributions. OT suffers however from a few drawbacks, computational and statistical, which have encouraged the proposal of several regularized variants of OT in the recent literature, one of the most notable being the sliced formulation, which exploits the closed-form formula between univariate distributions by projecting high-dimensional measures onto random lines. We consider in this work a more general family of ground metrics, namely tree metrics, which also yield fast closed-form computations and negative definite distances, and of which the sliced-Wasserstein distance is a particular case (the tree is a chain). We propose the tree-sliced Wasserstein distance, computed by averaging the Wasserstein distance between these measures using random tree metrics, built adaptively in either low- or high-dimensional spaces. Exploiting the negative definiteness of that distance, we also propose a positive definite kernel, and test it against other baselines on a few benchmark tasks.
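
What makes tree metrics attractive computationally is that the corresponding 1-Wasserstein distance has a closed form: a weighted sum, over the edges of the tree, of the absolute difference between the masses the two measures place in the subtree hanging below each edge. Below is a minimal sketch of that formula under assumed conventions (edges listed children-before-parents, measures given as node-to-mass dictionaries); the chain example recovers the one-dimensional case mentioned in the abstract.

    def tree_wasserstein(edges, mu, nu):
        # edges: (child, parent, weight) triples, each child listed before its parent.
        # mu, nu: dicts mapping tree node -> probability mass.
        nodes = {n for c, p, _ in edges for n in (c, p)}
        excess = {n: mu.get(n, 0.0) - nu.get(n, 0.0) for n in nodes}
        dist = 0.0
        for child, parent, w in edges:
            dist += w * abs(excess[child])   # net mass that must cross this edge
            excess[parent] += excess[child]  # push the imbalance toward the root
        return dist

    # A chain is a tree, recovering the 1D case: move one unit of mass two edges.
    print(tree_wasserstein([(0, 1, 1.0), (1, 2, 1.0)], {0: 1.0}, {2: 1.0}))  # 2.0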


Wasserstein Training of Restricted Boltzmann Machines

Neural Information Processing Systems

Boltzmann machines are able to learn highly complex, multimodal, structured and multiscale real-world data distributions. Parameters of the model are usually learned by minimizing the Kullback-Leibler (KL) divergence from training samples to the learned model. We propose in this work a novel approach for Boltzmann machine training which assumes that a meaningful metric between observations is known. This metric between observations can then be used to define the Wasserstein distance between the distribution induced by the Boltzmann machine on the one hand, and that given by the training sample on the other hand. We derive a gradient of that distance with respect to the model parameters. Minimization of this new objective leads to generative models with different statistical properties. We demonstrate their practical potential on data completion and denoising, for which the metric between observations plays a crucial role.
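
As a rough illustration of the objective (not of the paper's gradient derivation), the sketch below estimates an entropy-regularized Wasserstein distance between a batch of binary training samples and a batch of model samples, taking the Hamming distance as the assumed metric between observations; the random arrays merely stand in for data and RBM (Gibbs) samples.

    import numpy as np

    def hamming_cost(X, Y):
        # Pairwise Hamming distances between binary samples X (n, d) and Y (m, d).
        return (X[:, None, :] != Y[None, :, :]).sum(-1).astype(float)

    def sinkhorn_distance(C, eps=1.0, n_iters=200):
        # Entropic approximation of the Wasserstein distance for cost matrix C.
        n, m = C.shape
        a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
        K = np.exp(-C / eps)
        u = np.ones(n)
        for _ in range(n_iters):
            v = b / (K.T @ u)
            u = a / (K @ v)
        P = u[:, None] * K * v[None, :]
        return float((P * C).sum())

    rng = np.random.default_rng(0)
    data = rng.integers(0, 2, size=(64, 16))   # stand-in for training samples
    model = rng.integers(0, 2, size=(64, 16))  # stand-in for samples from the RBM
    print(sinkhorn_distance(hamming_cost(data, model)))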


Stochastic Optimization for Large-scale Optimal Transport

Neural Information Processing Systems

Optimal transport (OT) defines a powerful framework to compare probability distributions in a geometrically faithful way. However, the practical impact of OT is still limited because of its computational burden. We propose a new class of stochastic optimization algorithms to cope with large-scale OT problems. These methods can handle arbitrary distributions (either discrete or continuous) as long as one is able to draw samples from them, which is the typical setup in high-dimensional learning problems.
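
One representative instance: when the target measure is discrete and the source is known only through a sampler, stochastic gradient ascent on the semi-dual of the entropy-regularized problem touches a single source sample per step. The sketch below is a minimal version of that idea; the step-size schedule and the toy measures are illustrative assumptions, not the paper's tuned algorithm.

    import numpy as np

    def semidual_sgd(sample_x, y, b, cost, eps=0.1, lr=1.0, n_steps=20000):
        # Stochastic ascent on the semi-dual potential v of entropic OT:
        # the source is accessed only through sample_x(), the target is (y, b).
        v = np.zeros(len(b))
        for t in range(1, n_steps + 1):
            x = sample_x()
            z = (v - cost(x, y)) / eps
            z = np.exp(z - z.max()) * b
            grad = b - z / z.sum()          # unbiased gradient of the semi-dual
            v += (lr / np.sqrt(t)) * grad   # decaying step size
        return v

    rng = np.random.default_rng(0)
    y = rng.normal(size=(10, 2))            # support of the discrete target
    b = np.full(10, 0.1)                    # target weights
    v = semidual_sgd(lambda: rng.normal(size=2), y, b,
                     cost=lambda x, ys: ((x - ys) ** 2).sum(-1))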


Generalizing Point Embeddings using the Wasserstein Space of Elliptical Distributions

Neural Information Processing Systems

Embedding complex objects as vectors in low-dimensional spaces is a longstanding problem in machine learning. We propose in this work an extension of that approach, which consists in embedding objects as elliptical probability distributions, namely distributions whose densities have elliptical level sets. We endow these measures with the 2-Wasserstein metric, with two important benefits: (i) For such measures, the squared 2-Wasserstein metric has a closed form, equal to a weighted sum of the squared Euclidean distance between means and the squared Bures metric between covariance matrices. The latter is a Riemannian metric between positive semi-definite matrices, which turns out to be Euclidean on a suitable factor representation of such matrices, which is valid on the entire geodesic between these matrices.
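
The closed form described in (i) is short enough to state in code. The sketch below computes it for the Gaussian case, where the weight on the covariance term equals one (for other elliptical families that weight is a family-dependent constant); scipy's sqrtm supplies the matrix square roots, and the function names are illustrative.

    import numpy as np
    from scipy.linalg import sqrtm

    def bures_sq(A, B):
        # Squared Bures metric between positive semi-definite matrices A and B.
        rA = sqrtm(A)
        return float(np.trace(A + B - 2.0 * sqrtm(rA @ B @ rA)).real)

    def w2_sq(m1, A, m2, B):
        # Squared 2-Wasserstein distance between Gaussians N(m1, A) and N(m2, B):
        # squared Euclidean distance between means plus squared Bures metric.
        return float(np.sum((m1 - m2) ** 2) + bures_sq(A, B))

    m1, m2 = np.zeros(2), np.ones(2)
    A, B = np.eye(2), 4.0 * np.eye(2)
    print(w2_sq(m1, A, m2, B))  # 2.0 from the means + 2.0 from Bures = 4.0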