AITopics | cloob

CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP

Andreas Fürst, Elisabeth Rumetshofer, Johannes Lehner, Viet Tran, Fei Tang, Hubert Ramsauer, David Kreil, Michael Kopp, Günter Klambauer, Angela Bitto-Nemling, Sepp Hochreiter

Neural Information Processing Systems | Feb-10-2026, 07:04:39 GMT

hopfield network, modern hopfield network, proceedings, (14 more...)

Country:

North America > United States (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Austria > Vienna (0.14)
(5 more...)

Genre: Research Report (0.67)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP

Neural Information Processing Systems | Dec-24-2025, 15:01:45 GMT

CLIP yielded impressive results on zero-shot transfer learning tasks and is considered a foundation model like BERT or GPT3. CLIP vision models that have a rich representation are pre-trained using the InfoNCE objective and natural language supervision before they are fine-tuned on particular tasks. Though CLIP excels at zero-shot transfer learning, it suffers from an explaining-away problem, that is, it focuses on one or a few features while neglecting other relevant features. This problem is caused by insufficiently extracting the covariance structure in the original multi-modal data. We suggest using modern Hopfield networks to tackle the problem of explaining away. Their retrieved embeddings have an enriched covariance structure derived from co-occurrences of features in the stored embeddings.

cloob, modern hopfield network, zero-shot transfer, (10 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP

Andreas Fürst, Elisabeth Rumetshofer, Johannes Lehner, Viet Tran, Fei Tang, Hubert Ramsauer, David Kreil, Michael Kopp, Günter Klambauer, Angela Bitto-Nemling, Sepp Hochreiter

Neural Information Processing Systems | Aug-16-2025, 12:03:55 GMT

artificial intelligence, hopfield network, machine learning, (16 more...)

Country: Europe > Austria (0.68)

Genre: Research Report (0.67)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

An experimental approach on Few Shot Class Incremental Learning

Adam, Marinela

arXiv.org Artificial Intelligence | Mar-14-2025

Few-Shot Class-Incremental Learning (FSCIL) represents a cutting-edge paradigm within the broader scope of machine learning, designed to empower models with the ability to assimilate new classes of data from limited examples while safeguarding existing knowledge. The paper presents different solutions, backed by extensive experiments across large-scale datasets, domain shifts, and network architectures, to evaluate and compare the selected methods. We highlight their advantages and then present an experimental approach that aims to improve the most promising one by replacing the visual-language (V-L) model (CLIP) with another V-L model (CLOOB) that seems to outperform it on zero-shot learning tasks. The aim of this report is to present an experimental method for FSCIL that would improve its performance. We also plan to offer an overview followed by an analysis of recent advancements in the FSCIL domain, focusing on various strategies to mitigate catastrophic forgetting and improve the adaptability of models to evolving tasks and datasets.

class-incremental learning, dataset, learning, (13 more...)

arXiv:2503.11349

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)
Overview (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP

Neural Information Processing Systems | Feb-8-2025, 08:10:15 GMT

CLIP yielded impressive results on zero-shot transfer learning tasks and is considered a foundation model like BERT or GPT3. CLIP vision models that have a rich representation are pre-trained using the InfoNCE objective and natural language supervision before they are fine-tuned on particular tasks. Though CLIP excels at zero-shot transfer learning, it suffers from an explaining-away problem, that is, it focuses on one or a few features while neglecting other relevant features. This problem is caused by insufficiently extracting the covariance structure in the original multi-modal data. We suggest using modern Hopfield networks to tackle the problem of explaining away. Their retrieved embeddings have an enriched covariance structure derived from co-occurrences of features in the stored embeddings.

infoloob outperform clip, modern hopfield network, zero-shot transfer, (7 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.72)

Topological Perspectives on Optimal Multimodal Embedding Spaces

B, Abdul Aziz A., Rahim, A. B Abdul

arXiv.org Artificial Intelligence | May-29-2024

Recent strides in multimodal model development have ignited a paradigm shift in the realm of text-to-image generation. Among these advancements, CLIP stands out as a remarkable achievement: a sophisticated autoencoder adept at encoding both textual and visual information within a unified latent space. This paper presents a comparative analysis of CLIP and its recent counterpart, CLOOB. To unravel the distinctions between the embedding spaces crafted by these models, we employ topological data analysis. Our approach encompasses a comprehensive examination of the modality gap drivers, the clustering structures existing across both high and low dimensions, and the pivotal role that dimension collapse plays in shaping their respective embedding spaces. Empirical experiments substantiate the implications of our analyses on downstream performance across various contextual scenarios. Through this investigation, we aim to shed light on what underlies the comparative efficacy of CLIP and CLOOB, offering insights into their respective strengths and weaknesses and providing a foundation for further refinement and advancement in multimodal model research.

arxiv, clip and cloob, cloob, (15 more...)

arXiv:2405.18867

Country: Asia > Middle East > Saudi Arabia > Riyadh Province > Riyadh (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

CLOOB: A New Contrastive Learning Method That Outperforms CLIP - AI Summary

#artificialintelligence | Jan-3-2023, 23:50:23 GMT

The paper "CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP" introduces a new self-supervised learning method, where modern Hopfield networks boost contrastive learning using the InfoLOOB objective (Leave One Out Bound). CLOOB consistently outperforms CLIP at zero-shot transfer learning across different architectures and datasets.

ai summary, new contrastive learning method, outperform clip, (1 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.89)

CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP

Fürst, Andreas, Rumetshofer, Elisabeth, Lehner, Johannes, Tran, Viet, Tang, Fei, Ramsauer, Hubert, Kreil, David, Kopp, Michael, Klambauer, Günter, Bitto-Nemling, Angela, Hochreiter, Sepp

arXiv.org Artificial Intelligence | Nov-7-2022

CLIP yielded impressive results on zero-shot transfer learning tasks and is considered a foundation model like BERT or GPT3. CLIP vision models that have a rich representation are pre-trained using the InfoNCE objective and natural language supervision before they are fine-tuned on particular tasks. Though CLIP excels at zero-shot transfer learning, it suffers from an explaining-away problem, that is, it focuses on one or a few features while neglecting other relevant features. This problem is caused by insufficiently extracting the covariance structure in the original multi-modal data. We suggest using modern Hopfield networks to tackle the problem of explaining away. Their retrieved embeddings have an enriched covariance structure derived from co-occurrences of features in the stored embeddings. However, modern Hopfield networks increase the saturation effect of the InfoNCE objective, which hampers learning. We propose to use the InfoLOOB objective to mitigate this saturation effect. We introduce the novel "Contrastive Leave One Out Boost" (CLOOB), which uses modern Hopfield networks for covariance enrichment together with the InfoLOOB objective. In experiments, we compare CLOOB to CLIP after pre-training on the Conceptual Captions and the YFCC datasets with respect to their zero-shot transfer learning performance on other datasets. CLOOB consistently outperforms CLIP at zero-shot transfer learning across all considered architectures and datasets.
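
The saturation effect mentioned in the abstract can be illustrated with a small sketch. It assumes the usual formulations: InfoNCE keeps the positive pair in the denominator, while InfoLOOB (the leave-one-out bound) sums over the negatives only. The function names `info_nce` and `info_loob`, the similarity matrix `sim`, and the temperature `tau` are illustrative choices, not taken from the paper's code, and only one direction (rows as anchors) is computed here rather than a symmetric loss.

```python
import math


def _log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def info_nce(sim, tau=0.1):
    """InfoNCE: the positive pair (i, i) is part of the denominator."""
    n = len(sim)
    total = 0.0
    for i in range(n):
        logits = [sim[i][j] / tau for j in range(n)]
        total += _log_sum_exp(logits) - logits[i]
    return total / n


def info_loob(sim, tau=0.1):
    """InfoLOOB: the positive pair is left out of the denominator."""
    n = len(sim)
    total = 0.0
    for i in range(n):
        negatives = [sim[i][j] / tau for j in range(n) if j != i]
        total += _log_sum_exp(negatives) - sim[i][i] / tau
    return total / n


# Hypothetical similarity matrix: sim[i][j] compares image i with caption j.
sim = [
    [0.9, 0.1, 0.0],
    [0.2, 0.8, 0.1],
    [0.0, 0.1, 0.7],
]
print("InfoNCE :", info_nce(sim))
print("InfoLOOB:", info_loob(sim))
```

In this toy setting the InfoNCE value is already close to its lower bound of zero once the positive pairs dominate, so little training signal remains, whereas the InfoLOOB value has no such floor and keeps rewarding a larger margin between positives and negatives.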

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv:2110.11316

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Austria > Vienna (0.14)
Europe > Croatia (0.04)
(17 more...)

Genre: Research Report > Experimental Study (0.67)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)
Automobiles & Trucks > Manufacturer (1.00)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP

#artificialintelligence | Jun-13-2022, 07:43:39 GMT

Similar to the associative memory of humans, our approach uses associative memories to amplify co-occurrences and the covariance structure. The associative memory of our choice is a modern Hopfield network because of its fast retrieval and high storage capacity, as shown in "Hopfield Networks is All You Need". The update mechanism of modern Hopfield networks is equivalent to the self-attention mechanism of Transformer networks. However, modern Hopfield networks are more general and have a broader functionality, of which Transformer self-attention is just one example. The corresponding Hopfield layers can be built into deep learning architectures for associating two sets, encoder-decoder attention, multiple instance learning, or averaging and pooling operations.
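
The link between the Hopfield update and attention can be made concrete with a minimal sketch of one retrieval step, assuming the update rule of modern Hopfield networks, in which the new state is a softmax-weighted average of the stored patterns. The names `hopfield_retrieve`, `stored`, `noisy`, and the value of `beta` are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hopfield_retrieve(query, stored, beta=8.0):
    """One update step: weight each stored pattern by softmax(beta * <pattern, query>).

    This has the same form as attention with the query as Q and the
    stored patterns acting as both keys and values.
    """
    scores = [beta * sum(q * x for q, x in zip(query, pattern)) for pattern in stored]
    weights = softmax(scores)
    dim = len(query)
    return [sum(w * pattern[d] for w, pattern in zip(weights, stored)) for d in range(dim)]


# Three stored patterns and a noisy version of the first one.
stored = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
noisy = [0.9, 0.2, 0.1]
retrieved = hopfield_retrieve(noisy, stored)
print(retrieved)
```

With a reasonably large `beta`, a single step pulls the noisy query close to the stored pattern it most resembles; a smaller `beta` would instead return a blend of several patterns, which corresponds to the averaging and pooling behavior mentioned above.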

artificial intelligence, hopfield network, machine learning, (6 more...)

Technology: