Learning with a Wasserstein Loss

Charlie Frogner, Chiyuan Zhang, Hossein Mobahi, Mauricio Araya, Tomaso A. Poggio

Oct-2-2025, 12:17:00 GMT–Neural Information Processing Systems

Learning to predict multi-label outputs is challenging, but in many problems there is a natural metric on the outputs that can be used to improve predictions. In this paper we develop a loss function for multi-label learning, based on the Wasserstein distance. The Wasserstein distance provides a natural notion of dissimilarity for probability measures. Although optimizing with respect to the exact Wasserstein distance is costly, recent work has described a regularized approximation that is efficiently computed. We describe an efficient learning algorithm based on this regularization, as well as a novel extension of the Wasserstein distance from probability measures to unnormalized measures. We also describe a statistical learning bound for the loss. The Wasserstein loss can encourage smoothness of the predictions with respect to a chosen metric on the output space. We demonstrate this property on a real-data tag prediction problem, using the Yahoo Flickr Creative Commons dataset, outperforming a baseline that doesn't use the metric.
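
To make the idea of a metric-aware loss concrete, the following is a minimal sketch, not the authors' implementation. It assumes the regularized approximation mentioned above is the entropy-regularized form solved by Sinkhorn-style alternating scaling. The function name `sinkhorn_distance`, the parameters `reg` and `n_iters`, and the three-bin example are hypothetical choices made for illustration.

```python
import math


def sinkhorn_distance(a, b, cost, reg=0.1, n_iters=200):
    """Entropy-regularized approximation of the Wasserstein distance.

    a, b  : histograms as lists of non-negative floats, each summing to 1
    cost  : cost[i][j] is the ground-metric distance between bins i and j
    reg   : regularization strength (illustrative default, an assumption)
    """
    n, m = len(a), len(b)
    # Gibbs kernel K = exp(-cost / reg)
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns so the plan matches a and b
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan T = diag(u) K diag(v); return its total transport cost
    return sum(
        u[i] * K[i][j] * v[j] * cost[i][j] for i in range(n) for j in range(m)
    )


# Three output bins on a line; the ground metric is |i - j|.
cost = [[abs(i - j) for j in range(3)] for i in range(3)]
target = [1.0, 0.0, 0.0]
near_miss = [0.0, 1.0, 0.0]  # all mass one bin away from the target
far_miss = [0.0, 0.0, 1.0]   # all mass two bins away from the target

print(sinkhorn_distance(near_miss, target, cost))
print(sinkhorn_distance(far_miss, target, cost))
```

In this toy setting the loss grows with how far the predicted mass sits from the target under the ground metric, whereas a loss that ignores the metric would treat both misses as equally wrong. Very small values of `reg` relative to the costs can underflow the kernel, so a practical version would typically work in the log domain.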

prediction, wasserstein distance, wasserstein loss

Country:
- Europe > Poland (0.04)
- North America > United States
  - Michigan (0.04)
  - Massachusetts > Middlesex County
    - Cambridge (0.05)

Industry:
- Information Technology (0.56)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Machine Learning > Statistical Learning (1.00)
  - Representation & Reasoning (0.94)
