AITopics | contrastive model

SelecMix: Debiased Learning by Contradicting-pair Sampling

Neural Information Processing Systems | Jun-15-2026, 22:58:03 GMT

Neural networks trained with ERM (empirical risk minimization) sometimes learn unintended decision rules, in particular when their training data is biased, i.e., when training labels are strongly correlated with undesirable features. To prevent a network from learning such features, recent methods augment training data such that examples displaying spurious correlations (i.e., bias-aligned examples) become a minority, whereas the other, bias-conflicting examples become prevalent. However, these approaches are sometimes difficult to train and scale to real-world data because they rely on generative models or disentangled representations. We propose an alternative based on mixup, a popular augmentation that creates convex combinations of training examples. Our method, coined SelecMix, applies mixup to contradicting pairs of examples, defined as showing either (i) the same label but dissimilar biased features, or (ii) different labels but similar biased features. Identifying such pairs requires comparing examples with respect to unknown biased features. For this, we utilize an auxiliary contrastive model with the popular heuristic that biased features are learned preferentially during training. Experiments on standard benchmarks demonstrate the effectiveness of the method, in particular when label noise complicates the identification of bias-conflicting examples.
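
The pair-selection idea in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. It assumes that a vector of biased features is already available for every example (in the paper these would come from the auxiliary contrastive model), that similarity is measured with cosine similarity, and that the most extreme candidate is always chosen. The function names `contradicting_partner` and `mixup` are hypothetical. The abstract does not say how labels of mixed examples are assigned, so the sketch mixes inputs only.

```python
import math

def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def contradicting_partner(i, feats, labels, same_label):
    """Return an index j that forms a contradicting pair with example i.

    feats[k] is an (assumed) biased-feature vector for example k.
    same_label=True  -> case (i):  same label, most dissimilar biased features.
    same_label=False -> case (ii): different label, most similar biased features.
    Returns None when no candidate exists.
    """
    if same_label:
        cands = [j for j in range(len(labels)) if j != i and labels[j] == labels[i]]
        pick = min  # least similar biased features
    else:
        cands = [j for j in range(len(labels)) if labels[j] != labels[i]]
        pick = max  # most similar biased features
    if not cands:
        return None
    return pick(cands, key=lambda j: cosine(feats[i], feats[j]))

def mixup(x_i, x_j, lam):
    """Convex combination lam * x_i + (1 - lam) * x_j of two inputs."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
```

For example, with `feats = [[1, 0], [0.9, 0.1], [0, 1], [1, 0.05]]` and `labels = [0, 0, 0, 1]`, example 0 is paired with example 2 in case (i) and with example 3 in case (ii). A practical version would likely sample partners stochastically rather than always taking the extreme one.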

artificial intelligence, bias label, machine learning, (18 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Model Inversion with Layer-Specific Modeling and Alignment for Data-Free Continual Learning

Neural Information Processing Systems | Jun-15-2026, 13:26:56 GMT

Continual learning (CL) aims to incrementally train a model on a sequence of tasks while maintaining performance on previously seen ones. Although storing and replaying data mitigates forgetting, it is often infeasible due to privacy or security constraints and is impractical for arbitrary pre-trained models. Data-free or exemplar-free CL aims to continually update models with new tasks without storing previous data. In addition to regularizing updates, we employ model inversion to synthesize data from the trained model, anchoring learned knowledge through replay without retaining old data. However, model inversion in predictive models faces two key challenges.

artificial intelligence, machine learning, natural language, (18 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)
Workflow (0.66)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

b8b93c48f5bfa385d071342089d70422-Paper-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing Systems | Apr-30-2026, 01:20:31 GMT

caption, large language model, machine learning, (22 more...)

Country: Europe (0.93)

Genre:

Overview (0.68)
Research Report > New Finding (0.46)

Industry:

Information Technology (0.68)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

VLC: Extending Vision-Language Compositionality Evaluation with Text-to-Image Retrieval

Neural Information Processing Systems | Feb-17-2026, 17:46:41 GMT

Compositionality is still a challenging problem.

caption, large language model, machine learning, (22 more...)

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Spain > Basque Country (0.04)

Genre:

Overview (0.68)
Research Report > New Finding (0.46)

Industry:

Information Technology (0.68)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Sensing and Signal Processing > Image Processing (0.84)
(2 more...)

36a16a2505369e0c922b6ea7a23a56d2-Reviews.html

Neural Information Processing Systems | Oct-3-2025, 08:55:42 GMT

It is important to also demonstrate the robustness of the proposed algorithm on data relative to the naive model.

background, contrastive learning, foreground data, (12 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Nevada (0.04)

Genre: Summary/Review (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)

SelecMix: Debiased Learning by Contradicting-pair Sampling

Neural Information Processing Systems | Aug-15-2025, 03:12:52 GMT

For this, we utilize an auxiliary contrastive model with the popular heuristic that biased features are learned preferentially during training.

auxiliary model, bias label, bias-conflicting example, (14 more...)

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
Asia > South Korea > Seoul > Seoul (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.30)

The "Law" of the Unconscious Contrastive Learner: Probabilistic Alignment of Unpaired Modalities

Che, Yongwei, Eysenbach, Benjamin

arXiv.org Machine Learning | Jan-20-2025

While internet-scale data often comes in pairs (e.g., audio/image, image/text), we often want to perform inferences over modalities unseen together in the training data (e.g., audio/text). Empirically, this can often be addressed by learning multiple contrastive embedding spaces between existing modality pairs, implicitly hoping that unseen modality pairs will end up being aligned. This theoretical paper proves that this hope is well founded, under certain assumptions. Starting with the proper Bayesian approach of integrating out intermediate modalities, we show that directly comparing the representations of data from unpaired modalities can recover the same likelihood ratio. Our analysis builds on prior work on the geometry and probabilistic interpretation of contrastive representations, showing how these representations can answer many of the same inferences as probabilistic graphical models. Our analysis suggests two new ways of using contrastive representations: in settings with pre-trained contrastive models, and for handling language ambiguity in reinforcement learning. Our numerical experiments study the importance of our assumptions and demonstrate these new applications. Much of the appeal of contrastive learning is that it gives a "plug-n-play" approach for swapping one modality for another. Because representations from different modalities are trained to be aligned when representing the same object, the hope is that (say) a language representation and image representation of the same scene can be used as substitutes. This property is practically appealing for at least two reasons. First, it allows us to make use of pre-trained models. If you have a model that wants to make use of (say) language input and you have access to a pre-trained image-language contrastive model, you might simply train your model on the pre-trained image representations and hope that it will continue to work when you swap in the language representations.
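
To make "integrating out intermediate modalities" concrete, here is a small worked identity. It is only an illustration under an assumption that we state ourselves, and not necessarily the paper's exact setup: three modalities A, B, C, where (A, B) and (B, C) are observed in pairs and C is conditionally independent of A given B. The symbols a, b, c are hypothetical notation.

```latex
% Assumption (ours): C is conditionally independent of A given B,
% i.e. p(c | b, a) = p(c | b).
\begin{align}
p(c \mid a) &= \int p(c \mid b)\, p(b \mid a)\, \mathrm{d}b, \\
\frac{p(c \mid a)}{p(c)} &= \int \frac{p(c \mid b)}{p(c)}\, p(b \mid a)\, \mathrm{d}b .
\end{align}
```

The second line is the first divided by p(c). It expresses the likelihood ratio between the unpaired modalities A and C as an average, over the intermediate modality B, of a ratio that involves only the paired modalities B and C. The abstract's claim is that directly comparing the A and C representations can recover this same ratio under the paper's assumptions.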

artificial intelligence, machine learning, representation, (19 more...)

2501.11326

Genre: Research Report > New Finding (0.46)

Industry:

Government (0.68)
Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

When can we Approximate Wide Contrastive Models with Neural Tangent Kernels and Principal Component Analysis?

Anil, Gautham Govind, Esser, Pascal, Ghoshdastidar, Debarghya

arXiv.org Machine Learning | Mar-13-2024

Contrastive learning is a paradigm for learning representations from unlabelled data that has been highly successful for image and text data. Several recent works have examined contrastive losses to claim that contrastive models effectively learn spectral embeddings, while a few works show relations between (wide) contrastive models and kernel principal component analysis (PCA). However, it is not known whether trained contrastive models indeed correspond to kernel methods or PCA. In this work, we analyze the training dynamics of two-layer contrastive models with non-linear activation, and answer when these models are close to PCA or kernel methods. It is well known in the supervised setting that neural networks are equivalent to neural tangent kernel (NTK) machines, and that the NTK of infinitely wide networks remains constant during training. We provide the first convergence results of the NTK for contrastive losses, and present a nuanced picture: the NTK of wide networks remains almost constant for contrastive losses based on cosine similarity, but not for losses based on dot product similarity. We further study the training dynamics of contrastive models with orthogonality constraints on the output layer, which are implicitly assumed in works relating contrastive learning to spectral embedding. Our deviation bounds suggest that representations learned by contrastive models are close to the principal components of a certain matrix computed from random features. We empirically show that our theoretical results possibly hold beyond two-layer networks.
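
The two similarity choices contrasted in this abstract differ in one basic respect that a short sketch makes visible: cosine similarity ignores the scale of the embeddings, while dot product similarity does not. The code below is a minimal illustration and not the paper's analysis. It assumes an InfoNCE-style loss with a single positive and no temperature, and the name `contrastive_loss` is hypothetical.

```python
import math

def dot(u, v):
    """Dot product similarity."""
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    """Cosine similarity; assumes both vectors are non-zero."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

def contrastive_loss(anchor, positive, negatives, sim):
    """InfoNCE-style loss: negative log-softmax score of the positive.

    `sim` is the similarity function (for example `dot` or `cosine`).
    """
    pos = sim(anchor, positive)
    scores = [pos] + [sim(anchor, n) for n in negatives]
    m = max(scores)  # subtract the max for a numerically stable log-sum-exp
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - pos
```

Multiplying every embedding by a constant leaves the cosine-based loss unchanged but changes the dot-product-based loss, which gives some intuition for why the two families of losses can behave differently during training.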

2403.08673

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.60)

FLGuard: Byzantine-Robust Federated Learning via Ensemble of Contrastive Models

Lee, Younghan, Cho, Yungi, Han, Woorim, Bae, Ho, Paek, Yunheung

arXiv.org Artificial Intelligence | Mar-5-2024

Federated Learning (FL) thrives in training a global model with numerous clients by sharing only the parameters of their local models, which are trained on their private training datasets. The clients can therefore obtain a high-performance deep learning (DL) model without revealing their private datasets. However, recent research has proposed poisoning attacks that cause a catastrophic loss in the accuracy of the global model when adversaries, posing as benign clients, are present in a group of clients. In response, recent studies have suggested byzantine-robust FL methods that allow the server to train an accurate global model even with adversaries present in the system. However, many existing methods require knowledge of the number of malicious clients or an auxiliary (clean) dataset, or their effectiveness reportedly decreases sharply when the private datasets are not independent and identically distributed (non-IID). In this work, we propose FLGuard, a novel byzantine-robust FL method that detects malicious clients and discards malicious local updates by utilizing contrastive learning, a technique that has shown tremendous improvement as a self-supervised learning method. With contrastive models, we design FLGuard as an ensemble scheme to maximize the defensive capability. We evaluate FLGuard extensively under various poisoning attacks and compare the accuracy of the global model with existing byzantine-robust FL methods. FLGuard outperforms the state-of-the-art defense methods in most cases and shows drastic improvement, especially in non-IID settings.

dataset, flguard, local update, (16 more...)

2403.02846

Country: Asia > South Korea > Seoul > Seoul (0.05)

Genre: Research Report > New Finding (0.86)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Universum-inspired Supervised Contrastive Learning

Han, Aiyang, Geng, Chuanxing, Chen, Songcan

arXiv.org Artificial Intelligence | Oct-31-2023

As an effective data augmentation method, Mixup synthesizes additional samples through linear interpolations. Despite its theoretical dependency on data properties, Mixup reportedly performs well as a regularizer and calibrator, contributing reliable robustness and generalization to deep model training. In this paper, inspired by Universum Learning, which uses out-of-class samples to assist the target tasks, we investigate Mixup from a largely under-explored perspective: the potential to generate in-domain samples that belong to none of the target classes, that is, universum. We find that in the framework of supervised contrastive learning, Mixup-induced universum can serve as surprisingly high-quality hard negatives, greatly relieving the need for large batch sizes in contrastive learning. With these findings, we propose Universum-inspired supervised Contrastive learning (UniCon), which incorporates the Mixup strategy to generate Mixup-induced universum as universum negatives and pushes them apart from anchor samples of the target classes. We extend our method to the unsupervised setting, proposing the Unsupervised Universum-inspired contrastive model (Un-Uni). Our approach not only improves Mixup with hard labels, but also introduces a novel measure to generate universum data. With a linear classifier on the learned representations, UniCon shows state-of-the-art performance on various datasets. In particular, UniCon achieves 81.7% top-1 accuracy on CIFAR-100, surpassing the state of the art by a significant margin of 5.2% with a much smaller batch size, typically 256 in UniCon vs. 1024 in SupCon using ResNet-50. Un-Uni also outperforms SOTA methods on CIFAR-100. The code of this paper is released on https://github.com/hannaiiyanggit/UniCon.
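
The notion of a Mixup-induced universum can be illustrated with a short sketch. This is not the released UniCon code. It assumes that each anchor is mixed with one randomly chosen sample of a different class and that the mixing coefficient is 0.5, so the result sits between two classes and belongs to neither. The function name `universum_negatives` and its parameters are hypothetical.

```python
import random

def universum_negatives(xs, labels, lam=0.5, rng=None):
    """Build one Mixup-induced universum sample per anchor.

    Each anchor xs[i] is mixed with a randomly chosen sample from a
    different class (an assumption of this sketch). Entries are None
    when no out-of-class sample exists.
    """
    rng = rng or random.Random(0)  # fixed seed keeps the sketch reproducible
    out = []
    for x, y in zip(xs, labels):
        others = [k for k, yk in enumerate(labels) if yk != y]
        if not others:
            out.append(None)
            continue
        k = rng.choice(others)
        # Linear interpolation between the anchor and the out-of-class sample.
        out.append([lam * a + (1.0 - lam) * b for a, b in zip(x, xs[k])])
    return out
```

In a supervised contrastive setup, such samples would be used as negatives for their anchors. How they enter the loss is specified in the paper and is not reproduced here.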

learning, universum data, universum negative, (14 more...)

doi: 10.1109/TIP.2023.3290514

2204.10695

Country: Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
