AITopics | wasserstein loss

Collaborating Authors

wasserstein loss

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Appendix for " Disentangled Wasserstein Autoencoder for Protein Engineering " Anonymous Author(s) Affiliation Address email 1 Data preparation 1 1.1 Combination of data sources 2

Neural Information Processing SystemsFeb-17-2026, 18:02:08 GMT

We repeat this process until the size of the negative set is 5x that of the positive set. The expanded dataset is then provided to the respective ERGO model. Any unobserved pair is treated as negative. Performance is shown in Table S2. TCRs that have more than one positive prediction or have at least one wrong prediction.

artificial intelligence, machine learning, sequence, (16 more...)

Neural Information Processing Systems

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.77)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Appendix for " Disentangled Wasserstein Autoencoder for Protein Engineering " Anonymous Author(s) Affiliation Address email 1 Data preparation 1 1.1 Combination of data sources

Neural Information Processing SystemsOct-9-2025, 10:38:35 GMT

artificial intelligence, machine learning, sequence, (16 more...)

Neural Information Processing Systems

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.77)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Learning with a Wasserstein Loss

Charlie Frogner, Chiyuan Zhang, Hossein Mobahi, Mauricio Araya, Tomaso A. Poggio

Neural Information Processing SystemsOct-2-2025, 12:17:00 GMT

Learning to predict multi-label outputs is challenging, but in many problems there is a natural metric on the outputs that can be used to improve predictions. In this paper we develop a loss function for multi-label learning, based on the Wasserstein distance. The Wasserstein distance provides a natural notion of dissimilarity for probability measures. Although optimizing with respect to the exact Wasserstein distance is costly, recent work has described a regularized approximation that is efficiently computed. We describe an efficient learning algorithm based on this regularization, as well as a novel extension of the Wasserstein distance from probability measures to unnormalized measures. We also describe a statistical learning bound for the loss. The Wasserstein loss can encourage smoothness of the predictions with respect to a chosen metric on the output space. We demonstrate this property on a real-data tag prediction problem, using the Y ahoo Flickr Creative Commons dataset, outperforming a baseline that doesn't use the metric.

prediction, wasserstein distance, wasserstein loss, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > Michigan (0.04)
Europe > Poland (0.04)

Industry: Information Technology (0.56)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)

Add feedback

fba9d88164f3e2d9109ee770223212a0-Paper.pdf

Neural Information Processing SystemsAug-22-2025, 01:11:36 GMT

artificial intelligence, machine learning, texture, (19 more...)

Neural Information Processing Systems

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Utah (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(3 more...)

Genre: Research Report (0.93)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Sensing and Signal Processing > Image Processing (0.68)

Add feedback

TUSK_FULL

Yuhe Jin

Neural Information Processing SystemsAug-18-2025, 11:00:47 GMT

We then optimize the prototypes with a fixed encoder/decoder.

artificial intelligence, dataset, machine learning, (19 more...)

Neural Information Processing Systems

Country: Asia > China > Hong Kong (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.97)

Add feedback

Texture Interpolation for Probing Visual Perception Jonathan Vacher

Neural Information Processing SystemsAug-17-2025, 09:44:40 GMT

This is because most pairs of axis have similar variances.

artificial intelligence, machine learning, texture interpolation path, (11 more...)

Neural Information Processing Systems

Country:

North America > United States (0.06)
North America > Canada (0.05)
Europe > France > Île-de-France > Paris > Paris (0.05)

Industry: Health & Medicine (0.33)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.49)

Add feedback

Learning with a Wasserstein Loss

Neural Information Processing SystemsAug-12-2025, 23:06:20 GMT

Learning to predict multi-label outputs is challenging, but in many problems there is a natural metric on the outputs that can be used to improve predictions. In this paper we develop a loss function for multi-label learning, based on the Wasserstein distance. The Wasserstein distance provides a natural notion of dissimilarity for probability measures. Although optimizing with respect to the exact Wasserstein distance is costly, recent work has described a regularized approximation that is efficiently computed. We describe an efficient learning algorithm based on this regularization, as well as a novel extension of the Wasserstein distance from probability measures to unnormalized measures. We also describe a statistical learning bound for the loss. The Wasserstein loss can encourage smoothness of the predictions with respect to a chosen metric on the output space. We demonstrate this property on a real-data tag prediction problem, using the Yahoo Flickr Creative Commons dataset, outperforming a baseline that doesn't use the metric.

learning, wasserstein distance, wasserstein loss, (5 more...)

Neural Information Processing Systems

Industry: Information Technology (0.61)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.78)

Add feedback

Deep Generative Models: Complexity, Dimensionality, and Approximation

Wang, Kevin, Niu, Hongqian, Wang, Yixin, Li, Didong

arXiv.org Machine LearningApr-1-2025

Generative networks have shown remarkable success in learning complex data distributions, particularly in generating high-dimensional data from lower-dimensional inputs. While this capability is well-documented empirically, its theoretical underpinning remains unclear. One common theoretical explanation appeals to the widely accepted manifold hypothesis, which suggests that many real-world datasets, such as images and signals, often possess intrinsic low-dimensional geometric structures. Under this manifold hypothesis, it is widely believed that to approximate a distribution on a $d$-dimensional Riemannian manifold, the latent dimension needs to be at least $d$ or $d+1$. In this work, we show that this requirement on the latent dimension is not necessary by demonstrating that generative networks can approximate distributions on $d$-dimensional Riemannian manifolds from inputs of any arbitrary dimension, even lower than $d$, taking inspiration from the concept of space-filling curves. This approach, in turn, leads to a super-exponential complexity bound of the deep neural networks through expanded neurons. Our findings thus challenge the conventional belief on the relationship between input dimensionality and the ability of generative networks to model data distributions. This novel insight not only corroborates the practical effectiveness of generative networks in handling complex data structures, but also underscores a critical trade-off between approximation error, dimensionality, and model complexity.

artificial intelligence, dimension, machine learning, (20 more...)

arXiv.org Machine Learning

2504.0082

Country:

Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
North America > United States > North Carolina (0.04)
North America > United States > Michigan (0.04)

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.51)

Add feedback

Beyond Contrastive Learning: Synthetic Data Enables List-wise Training with Multiple Levels of Relevance

Esfandiarpoor, Reza, Zerveas, George, Zhang, Ruochen, Mgonzo, Macton, Eickhoff, Carsten, Bach, Stephen H.

arXiv.org Artificial IntelligenceMar-29-2025

Recent advancements in large language models (LLMs) have allowed the augmentation of information retrieval (IR) pipelines with synthetic data in various ways. Yet, the main training paradigm remains: contrastive learning with binary relevance labels and the InfoNCE loss, where one positive document is compared against one or more negatives. This objective treats all documents that are not explicitly annotated as relevant on an equally negative footing, regardless of their actual degree of relevance, thus (a) missing subtle nuances that are useful for ranking and (b) being susceptible to annotation noise. To overcome this limitation, in this work we forgo real training documents and annotations altogether and use open-source LLMs to directly generate synthetic documents that answer real user queries according to several different levels of relevance. This fully synthetic ranking context of graduated relevance, together with an appropriate list-wise loss (Wasserstein distance), enables us to train dense retrievers in a way that better captures the ranking task. Experiments on various IR datasets show that our proposed approach outperforms conventional training with InfoNCE by a large margin. Without using any real documents for training, our dense retriever significantly outperforms the same retriever trained through self-supervision. More importantly, it matches the performance of the same retriever trained on real, labeled training documents of the same dataset, while being more robust to distribution shift and clearly outperforming it when evaluated zero-shot on the BEIR dataset collection.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2503.23239

Country:

North America > United States > California > Santa Clara County > Santa Clara (0.04)
North America > Dominican Republic (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
(6 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine (1.00)
Education > Health & Safety > School Nutrition (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Fine-Tuning a Time Series Foundation Model with Wasserstein Loss

Chernov, Andrei

arXiv.org Artificial IntelligenceNov-18-2024

Inspired by recent advancements in large language models (LLMs) for Natural Language Processing (NLP), there has been a surge in research focused on developing foundational models for time series forecasting. One approach involves training LLM architectures on tokenized time series data using cross-entropy loss. Although this method has demonstrated promising results, cross-entropy loss is primarily designed for classification tasks and does not account for the distance between classes. To address this limitation, we propose using the Wasserstein loss for such architectures. To validate our approach, we fine-tuned a foundational time series model on $22$ zero-shot datasets, comparing the performance of cross-entropy loss with that of Wasserstein loss. Our results demonstrate that replacing cross-entropy loss with Wasserstein loss significantly improves point estimation.

cross-entropy loss, dataset, wasserstein loss, (13 more...)

arXiv.org Artificial Intelligence

2409.15367

Country:

North America > United States (0.14)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback