Rote Learning
Quantifying Memorization and Retriever Performance in Retrieval-Augmented Vision-Language Models
Carragher, Peter, Jha, Abhinand, Raghav, R, Carley, Kathleen M.
Large Language Models (LLMs) demonstrate remarkable capabilities in question answering (QA), but metrics for assessing their reliance on memorization versus retrieval remain underdeveloped. Moreover, while finetuned models are state-of-the-art on closed-domain tasks, general-purpose models like GPT-4o exhibit strong zero-shot performance. This raises questions about the trade-offs between memorization, generalization, and retrieval. In this work, we analyze the extent to which multimodal retrieval-augmented VLMs memorize training data compared to baseline VLMs. Using the WebQA benchmark, we contrast finetuned models with baseline VLMs on multihop retrieval and question answering, examining the impact of finetuning on data memorization. To quantify memorization in end-to-end retrieval and QA systems, we propose several proxy metrics by investigating instances where QA succeeds despite retrieval failing. Our results reveal the extent to which finetuned models rely on memorization. In contrast, retrieval-augmented VLMs have lower memorization scores, at the cost of accuracy (72% vs 52% on WebQA test set). As such, our measures pose a challenge for future work to reconcile memorization and generalization in both Open-Domain QA and joint Retrieval-QA tasks.
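A minimal sketch of how such a proxy could be computed, assuming per-question records with hypothetical fields retrieved_ids, gold_ids, and answer_correct; this illustrates the idea (QA success despite retrieval failure), not the paper's exact metric:

```python
# Hedged sketch: proxy memorization score for a joint retrieval-QA system.
# Field names (retrieved_ids, gold_ids, answer_correct) are assumptions.

def proxy_memorization_score(records):
    """Among questions where retrieval missed every gold source,
    the fraction that were still answered correctly."""
    retrieval_failures = [
        r for r in records
        if not set(r["retrieved_ids"]) & set(r["gold_ids"])
    ]
    if not retrieval_failures:
        return 0.0
    answered_anyway = sum(r["answer_correct"] for r in retrieval_failures)
    return answered_anyway / len(retrieval_failures)


# Toy usage:
records = [
    {"retrieved_ids": ["d3"], "gold_ids": ["d1"], "answer_correct": True},
    {"retrieved_ids": ["d1"], "gold_ids": ["d1"], "answer_correct": True},
    {"retrieved_ids": ["d7"], "gold_ids": ["d2"], "answer_correct": False},
]
print(proxy_memorization_score(records))  # 0.5
```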
Pruning as a Defense: Reducing Memorization in Large Language Models
Gupta, Mansi, Waghela, Nikhar, Gupta, Sarthak, Goel, Shourya, Shanmugavelu, Sanjif
Large language models have been shown to memorize significant portions of their training data, which they can reproduce when appropriately prompted. This work investigates the impact of simple pruning techniques on this behavior. Our findings reveal that pruning effectively reduces the extent of memorization in LLMs, demonstrating its potential as a foundational approach for mitigating membership inference attacks. Large language models are known to memorize portions of their training data, which poses significant privacy and security risks. Although various studies have explored the extent of memorization in LLMs, most of these efforts are qualitative (Carlini et al.
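As a rough, hedged illustration of the kind of simple pruning intervention discussed here, the sketch below applies unstructured L1 magnitude pruning to the linear layers of a small Hugging Face causal LM. The model name ("facebook/opt-125m") and the 30% sparsity level are arbitrary assumptions for illustration, not the paper's recipe.

```python
# Hedged sketch: L1 magnitude pruning of an LLM's linear layers.
import torch.nn as nn
import torch.nn.utils.prune as prune
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")  # assumed model

for module in model.modules():
    if isinstance(module, nn.Linear):
        # Zero out the 30% smallest-magnitude weights in this layer.
        prune.l1_unstructured(module, name="weight", amount=0.3)
        # Bake the mask into the weights and drop the reparametrization.
        prune.remove(module, "weight")

# Memorization would then be re-measured on the pruned model, e.g. by testing
# whether verbatim training continuations can still be elicited by prompting.
```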
None of the Others: a General Technique to Distinguish Reasoning from Memorization in Multiple-Choice LLM Evaluation Benchmarks
Salido, Eva Sánchez, Gonzalo, Julio, Marco, Guillermo
In LLM evaluations, reasoning is often distinguished from recall/memorization by performing numerical variations to math-oriented questions. Here we introduce a general variation method for multiple-choice questions that completely dissociates the correct answer from previously seen tokens or concepts, requiring LLMs to understand and reason (rather than memorizing) in order to answer correctly. Using this method, we evaluate state-of-the-art proprietary and open-source LLMs on two datasets available in English and Spanish: the public MMLU benchmark and the private UNED-Access 2024 dataset. Results show that all models experience remarkable accuracy drops under our proposed variation, with an average loss of 57% on MMLU and 50% on UNED-Access 2024, ranging from 10% to 93% across models. Notably, the most accurate model in our experimentation (OpenAI-o3-mini) is not the most robust (DeepSeek-R1-70B), suggesting that the best models in standard evaluations may not be the ones with better reasoning capabilities. Also, we see larger accuracy drops in public (vs private) datasets and questions posed in their original language (vs a manual translation), which are signs of contamination and also point to a relevant role of recall/memorization in current LLMs' answers.
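In the spirit of the title, one way such a variation could look is sketched below: the original correct option is removed and replaced by a "None of the other answers" choice, so the gold answer no longer shares tokens with anything the model may have seen during training. This is an illustrative reconstruction, not the authors' released implementation.

```python
# Hedged sketch of a "none of the others" multiple-choice variation.
# The record layout (stem, options, answer_idx) is an assumption.

def none_of_the_others_variation(question):
    options = list(question["options"])
    # Drop the original correct answer text and substitute a token-dissociated
    # option, which becomes the new gold answer.
    options[question["answer_idx"]] = "None of the other answers"
    return {"stem": question["stem"],
            "options": options,
            "answer_idx": question["answer_idx"]}


q = {"stem": "Which planet is known as the Red Planet?",
     "options": ["Venus", "Mars", "Jupiter", "Saturn"],
     "answer_idx": 1}
print(none_of_the_others_variation(q)["options"])
# ['Venus', 'None of the other answers', 'Jupiter', 'Saturn']
```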
Logarithmic Width Suffices for Robust Memorization
Egosi, Amitsour, Yehudai, Gilad, Shamir, Ohad
The ability of neural networks to memorize labeled datasets is a central question in the study of their expressive power. Given some input domain X, output domain Y, and dataset size N, we say that a network memorizes datasets of size N if, for every labeled dataset D ⊆ X × Y, where |D| = N, we can find parameters such that the resulting network f: X → Y perfectly fits the dataset (that is, f(x) = y for every labeled pair (x, y) ∈ D). The main question here - which has been studied in many recent works (see Section 2 for details) - is to characterize the size/architecture of the networks that have enough expressive power to memorize any dataset of a given size N. However, merely fitting a given dataset is not enough for most tasks, and a desirable property for trained networks is that they remain robust to noise and minor modifications in the dataset. This robustness property allows neural networks to generalize from observed data points to unseen data points. Furthermore, neural networks have been shown to be vulnerable to adversarial attacks [Szegedy et al., 2013, Carlini and Wagner, 2017, Papernot et al., 2017, Athalye et al., 2018] in the form of slightly perturbed examples, where (in the context of visual data) the perturbation is often imperceptible to the human eye. Moreover, existing constructions of memorizing networks are often quite delicate, and not at all robust to such perturbations. This motivates the question of characterizing the networks that have enough capacity to robustly memorize a dataset.
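In symbols (a paraphrase of the definition above, with μ introduced here as an assumed notation for the robustness radius): a network class memorizes datasets of size N if for every D ⊆ X × Y with |D| = N there are parameters θ such that f_θ(x) = y for all (x, y) ∈ D; robust memorization strengthens this to

\[
\exists\,\theta:\quad f_\theta(x') = y \;\; \text{for all } (x, y) \in D \text{ and all } x' \text{ with } \|x' - x\| \le \mu,
\]

assuming the points of D are separated by more than 2μ, so that the μ-balls around differently labeled points do not overlap.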
Counterfactual Memorization in Neural Language Models
Zhang, Chiyuan, Ippolito, Daphne, Lee, Katherine
Modern neural language models that are widely used in various NLP tasks risk memorizing sensitive information from their training data. Understanding this memorization is important in real-world applications and also from a learning-theoretical perspective. An open question in previous studies of language model memorization is how to filter out "common" memorization. In fact, most memorization criteria strongly correlate with the number of occurrences in the training set, capturing memorized familiar phrases, public knowledge, templated texts, or other repeated data. We formulate a notion of counterfactual memorization, which characterizes how a model's predictions change if a particular document is omitted during training. We identify and study counterfactually-memorized training examples in standard text datasets. We estimate the influence of each memorized training example on the validation set and on generated texts, showing how this can provide direct evidence of the source of memorization at test time.
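Following the description above, counterfactual memorization of a training example x can be written as the expected change in the model's performance on x when x is included in versus excluded from the training set. The notation below is an assumed paraphrase (A is the training algorithm, S a random training set, and M a per-example performance measure such as per-token accuracy):

\[
\mathrm{mem}(x) \;=\; \mathbb{E}_{S \,\ni\, x}\big[M(A(S), x)\big] \;-\; \mathbb{E}_{S' \,\not\ni\, x}\big[M(A(S'), x)\big].
\]

Examples with large mem(x) are those whose presence in training substantially changes the model's behavior on them, which filters out "common" memorization of frequently repeated text.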
On Memorization in Probabilistic Deep Generative Models
Gerrit J.J. van den Burg, Christopher K.I. Williams
While experimenting with the proposed memorization score on CIFAR-10 [47], we noticed that the images of automobiles shown in Figure 6 are present in the training set multiple times (with slight variation). These works are recently proposed probabilistic generative models that achieve impressive performance on sample quality metrics such as the inception score (IS) [35] and the Fréchet inception distance (FID) [36], and also achieve high log likelihoods. However, the fact that we were able to serendipitously spot images from the training set in the generated samples might suggest that some unintended memorization occurs in these models. We do not know if there are other images in the presented samples that are present in the training data. Of course, spotting near duplicates of training observations is only possible because these models yield realistic samples.
[Figure 6: Examples of images from the CIFAR-10 training set that were spotted in illustrations of samples from the model in recent work on generative models.]
On Memorization in Probabilistic Deep Generative Models
Gerrit J.J. van den Burg, Christopher K.I. Williams
Recent advances in deep generative models have led to impressive results in a variety of application domains. Motivated by the possibility that deep learning models might memorize part of the input data, there have been increased efforts to understand how memorization arises. In this work, we extend a recently proposed measure of memorization for supervised learning (Feldman, 2019) to the unsupervised density estimation problem and adapt it to be more computationally efficient. Next, we present a study that demonstrates how memorization can occur in probabilistic deep generative models such as variational autoencoders. This reveals that the form of memorization to which these models are susceptible differs fundamentally from mode collapse and overfitting. Furthermore, we show that the proposed memorization score measures a phenomenon that is not captured by commonly-used nearest neighbor tests. Finally, we discuss several strategies that can be used to limit memorization in practice. Our work thus provides a framework for understanding problematic memorization in probabilistic generative models.
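A compact way to express the adaptation described here (the notation is an assumed paraphrase: p_θ(· | S) denotes the density estimated from training set S):

\[
M(x) \;=\; \mathbb{E}\big[\log p_\theta(x \mid S)\big] \;-\; \mathbb{E}\big[\log p_\theta\big(x \mid S \setminus \{x\}\big)\big],
\]

so an observation counts as memorized when its estimated density drops sharply once it is removed from the training set, which is distinct from both mode collapse and classical overfitting.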
Captured by Captions: On Memorization and its Mitigation in CLIP Models
Wang, Wenhao, Dziedzic, Adam, Kim, Grace C., Backes, Michael, Boenisch, Franziska
Multi-modal models, such as CLIP, have demonstrated strong performance in aligning visual and textual representations, excelling in tasks like image retrieval and zero-shot classification. Despite this success, the mechanisms by which these models utilize training data, particularly the role of memorization, remain unclear. In uni-modal models, both supervised and self-supervised, memorization has been shown to be essential for generalization. However, it is not well understood how these findings would apply to CLIP, which incorporates elements from both supervised learning, via captions that provide a supervisory signal similar to labels, and self-supervised learning, via the contrastive objective. To bridge this gap in understanding, we propose a formal definition of memorization in CLIP (CLIPMem) and use it to quantify memorization in CLIP models. Our results indicate that CLIP's memorization behavior falls between the supervised and self-supervised paradigms, with "mis-captioned" samples exhibiting the highest levels of memorization. Additionally, we find that the text encoder contributes more to memorization than the image encoder, suggesting that mitigation strategies should focus on the text domain. Building on these insights, we propose multiple strategies to reduce memorization while at the same time improving utility--something that had not been shown before for traditional learning paradigms, where reducing memorization typically results in a utility decrease.
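The abstract does not spell out CLIPMem, but by analogy with the leave-one-out measures above, one plausible form (purely an assumption for illustration, not necessarily the authors' exact definition) scores a training pair of image x and caption c by how much their alignment depends on the pair being in the training set S:

\[
\mathrm{CLIPMem}(x, c) \;\approx\; \mathbb{E}\big[\operatorname{sim}\!\big(f_\theta(x), g_\theta(c)\big) \,\big|\, (x, c) \in S\big] \;-\; \mathbb{E}\big[\operatorname{sim}\!\big(f_\theta(x), g_\theta(c)\big) \,\big|\, (x, c) \notin S\big],
\]

where f and g denote the image and text encoders and sim is cosine similarity.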
Review for NeurIPS paper: Early-Learning Regularization Prevents Memorization of Noisy Labels
Weaknesses: I have many reservations about the claims of the paper. I would appreciate it if the authors can clarify some of these issues during their rebuttal. First, the proof of their main theorem about logistic regression has many issues. One key issue is that the authors make assumptions within the proof that are not clearly stated or justified upfront. For example, in Line 440 of the supplementary materials, the proof assumes a bound of roughly 0.1 on θᵀv.
Review for NeurIPS paper: Early-Learning Regularization Prevents Memorization of Noisy Labels
The paper studies the following interesting phenomenon (observed in the previous literature): when trained on a dataset with incorrectly labeled points (i.e. "label noise"), DNNs first learn the benign ("correctly labeled") points, and once this is done they start "memorizing" the noisy points. It was previously shown in the literature (empirically) that the second "memorization" phase hurts generalization. The authors make two contributions: (Contribution 1) They demonstrate (empirically and theoretically) that a similar phenomenon can be observed in the simpler setting of over-parameterized (dimensionality ≫ number of points) linear two-class logistic regression, when the class distributions are isotropic Gaussians with fixed means ±μ and vanishing variance (see Theorem 1 and Figure A.1). (Contribution 2) Motivated by the theory of Contribution 1, the authors propose a novel regularizer. When used in vanilla DNN training with the cross-entropy loss, this regularizer successfully prevents the networks from falling into the "memorization phase" (as evidenced by Figure 1). All the reviewers agree that the topic and the focus of this paper are very timely.