AITopics | unintended memorization

unintended memorization

Do SSL Models Have Déjà Vu? A Case of Unintended Memorization in Self-supervised Learning

Neural Information Processing Systems | Dec-26-2025, 06:51:51 GMT

Self-supervised learning (SSL) algorithms can produce useful image representations by learning to associate different parts of natural images with one another. However, when taken to the extreme, SSL models can unintentionally memorize specific parts of individual training samples rather than learning semantically meaningful associations. In this work, we perform a systematic study of the unintended memorization of image-specific information in SSL models, which we refer to as déjà vu memorization. Concretely, we show that given the trained model and a crop of a training image containing only the background (e.g., water, sky, grass), it is possible to infer the foreground object with high accuracy or even visually reconstruct it. Furthermore, we show that déjà vu memorization is common to different SSL algorithms, is exacerbated by certain design choices, and cannot be detected by conventional techniques for evaluating representation quality. Our study of déjà vu memorization reveals previously unknown privacy risks in SSL models and suggests potential practical mitigation strategies.

name change, self-supervised learning, unintended memorization, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

How much do language models memorize?

Morris, John X., Sitawarin, Chawin, Guo, Chuan, Kokhlikyan, Narine, Suh, G. Edward, Rush, Alexander M., Chaudhuri, Kamalika, Mahloujifar, Saeed

arXiv.org Artificial Intelligence | Jun-19-2025

We propose a new method for estimating how much a model knows about a datapoint and use it to measure the capacity of modern language models. Prior studies of language model memorization have struggled to disentangle memorization from generalization. We formally separate memorization into two components: unintended memorization, the information a model contains about a specific dataset, and generalization, the information a model contains about the true data-generation process. When we completely eliminate generalization, we can compute the total memorization, which provides an estimate of model capacity: our measurements estimate that GPT-style models have a capacity of approximately 3.6 bits per parameter. We train language models on datasets of increasing size and observe that models memorize until their capacity fills, at which point "grokking" begins, and unintended memorization decreases as models begin to generalize. We train hundreds of transformer language models ranging from 500K to 1.5B parameters and produce a series of scaling laws relating model capacity and data size to membership inference.
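As a rough illustration of what the quoted figure implies, the sketch below multiplies a parameter count by the approximate 3.6 bits per parameter reported in the abstract. The function names are hypothetical and the linear scaling is an assumption made for illustration; this is not the authors' measurement procedure.

```python
# Back-of-the-envelope capacity estimate. Assumes capacity scales linearly
# with parameter count at the ~3.6 bits/parameter quoted in the abstract.
BITS_PER_PARAM = 3.6  # approximate figure from the abstract


def capacity_bits(num_params: int, bits_per_param: float = BITS_PER_PARAM) -> float:
    """Estimated total memorization capacity in bits."""
    return num_params * bits_per_param


def capacity_megabytes(num_params: int, bits_per_param: float = BITS_PER_PARAM) -> float:
    """Same estimate expressed in megabytes (1 MB = 10**6 bytes)."""
    return capacity_bits(num_params, bits_per_param) / 8 / 1e6


if __name__ == "__main__":
    # The two ends of the model-size range mentioned in the abstract.
    for n in (500_000, 1_500_000_000):
        print(f"{n:>13,d} params -> {capacity_megabytes(n):.3f} MB")
```

Under these assumptions, the smallest and largest models in the quoted range would hold roughly 0.225 MB and 675 MB of memorized information, respectively.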

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2505.24832

Country: North America > United States > Illinois (0.04)

Genre: Research Report (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Malicious and Unintentional Disclosure Risks in Large Language Models for Code Generation

Rabin, Rafiqul, McGregor, Sean, Judd, Nick

arXiv.org Artificial Intelligence | Mar-27-2025

This paper explores the risk that a large language model (LLM) trained for code generation on data mined from software repositories will generate content that discloses sensitive information included in its training data. We decompose this risk, known in the literature as "unintended memorization," into two components: unintentional disclosure (where an LLM presents secrets to users without the user seeking them out) and malicious disclosure (where an LLM presents secrets to an attacker equipped with partial knowledge of the training data). We observe that while existing work mostly anticipates malicious disclosure, unintentional disclosure is also a concern. We describe methods to assess unintentional and malicious disclosure risks side-by-side across different releases of training datasets and models. We demonstrate these methods through an independent assessment of the Open Language Model (OLMo) family of models and its Dolma training datasets. Our results show, first, that changes in data source and processing are associated with substantial changes in unintended memorization risk; second, that the same set of operational changes may increase one risk while mitigating another; and, third, that the risk of disclosing sensitive information varies not only by prompt strategies or test datasets but also by the types of sensitive information. These contributions rely on data mining to enable greater privacy and security testing required for the LLM training data supply chain.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2503.2276

Country:

South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
North America > Canada > Ontario > Toronto (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)

Genre: Research Report > New Finding (0.87)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Unintended Memorization in Large ASR Models, and How to Mitigate It

Wang, Lun, Thakkar, Om, Mathews, Rajiv

arXiv.org Artificial Intelligence | Oct-18-2023

It is well known that neural networks can unintentionally memorize their training examples, causing privacy concerns. However, auditing memorization in large non-auto-regressive automatic speech recognition (ASR) models has been challenging due to the high compute cost of existing methods such as hardness calibration. In this work, we design a simple auditing method to measure memorization in large ASR models without the extra compute overhead. Concretely, we speed up randomly generated utterances to create a mapping between vocal and text information that is difficult to learn from typical training examples. Hence, accurate predictions only for sped-up training examples can serve as clear evidence for memorization, and the corresponding accuracy can be used to measure memorization. Using the proposed method, we showcase memorization in state-of-the-art ASR models. To mitigate memorization, we tried gradient clipping during training to bound the influence of any individual example on the final model. We empirically show that clipping each example's gradient can mitigate memorization for sped-up training examples with up to 16 repetitions in the training set. Furthermore, we show that in large-scale distributed training, clipping the average gradient on each compute core maintains neutral model quality and compute cost while providing strong privacy protection.
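The per-example clipping idea mentioned in the abstract can be sketched in a few lines. This is a generic illustration of L2-norm gradient clipping, written with plain Python lists so that it stays self-contained; the function names and the clipping threshold are hypothetical, and it is not the authors' training code.

```python
import math


def clip_gradient(grad, max_norm):
    """Rescale one example's gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm:
        return list(grad)  # already within the bound, leave unchanged
    scale = max_norm / norm
    return [g * scale for g in grad]


def clipped_mean_gradient(per_example_grads, max_norm):
    """Clip every example's gradient, then average across the batch.

    Because each clipped gradient has norm <= max_norm, no single example
    can move the averaged update by more than max_norm / batch_size.
    """
    clipped = [clip_gradient(g, max_norm) for g in per_example_grads]
    n = len(clipped)
    return [sum(column) / n for column in zip(*clipped)]


if __name__ == "__main__":
    # Toy batch: the first gradient has norm 5.0, the second has norm 0.5.
    batch = [[3.0, 4.0], [0.3, 0.4]]
    print(clipped_mean_gradient(batch, max_norm=1.0))
```

In the toy batch, only the first gradient exceeds the threshold and is scaled down; the second passes through untouched.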

inductive learning, machine learning, unintended memorization, (2 more...)

arXiv.org Artificial Intelligence

2310.11739

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (0.53)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

A Note On Interpreting Canary Exposure

Jagielski, Matthew

arXiv.org Artificial Intelligence | Jun-2-2023

Canary exposure, introduced in Carlini et al., is frequently used to empirically evaluate, or audit, the privacy of machine learning model training. The goal of this note is to provide some intuition on how to interpret canary exposure, including by relating it to membership inference attacks and differential privacy.
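For readers unfamiliar with the metric, the sketch below computes exposure in the rank-based form commonly attributed to Carlini et al.: log2 of the size of the candidate space minus log2 of the canary's rank when all candidates are sorted by model loss. The abstract does not state the formula, so treat this form, the function name, and the toy loss values as assumptions made for illustration.

```python
import math


def exposure(canary_loss, other_candidate_losses):
    """Rank-based exposure of a canary.

    canary_loss: model loss (e.g. log-perplexity) on the inserted canary.
    other_candidate_losses: losses on every other candidate that could have
        been inserted instead (the rest of the randomness space).
    """
    # Rank 1 means the model finds the canary more likely than all alternatives.
    rank = 1 + sum(1 for loss in other_candidate_losses if loss < canary_loss)
    space_size = len(other_candidate_losses) + 1
    return math.log2(space_size) - math.log2(rank)


if __name__ == "__main__":
    # Hypothetical losses for 7 alternative candidates (8 candidates in total).
    others = [5.1, 4.8, 6.0, 5.5, 4.9, 5.7, 5.3]
    print(exposure(2.0, others))  # canary ranked first among 8 candidates
    print(exposure(5.2, others))  # canary ranked in the middle
```

Exposure is highest when the canary ranks first and falls to zero when it ranks last, which is why it is read as a measure of how strongly the model prefers the inserted secret over its alternatives.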

artificial intelligence, exposure, machine learning, (11 more...)

arXiv.org Artificial Intelligence

2306.00133

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.05)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)

Mitigating Unintended Memorization in Language Models via Alternating Teaching

Liu, Zhe, Zhang, Xuedong, Peng, Fuchun

arXiv.org Artificial Intelligence | Oct-13-2022

Recent research has shown that language models have a tendency to memorize rare or unique sequences in the training corpora, which can then leak sensitive attributes of user data. We employ a teacher-student framework and propose a novel approach called alternating teaching to mitigate unintended memorization in sequential modeling. In our method, multiple teachers are trained on disjoint training sets whose privacy one wishes to protect, and the teachers' predictions supervise the training of a student model in an alternating manner at each time step. Experiments on LibriSpeech datasets show that the proposed method achieves better privacy-preserving results than its counterparts. Compared with applying no prevention for unintended memorization, the overall utility loss is small when training records are sufficient.

machine learning, memorization, natural language, (19 more...)

arXiv.org Artificial Intelligence

2210.06772

Country: North America > United States > California > San Mateo County > Menlo Park (0.04)

Genre: Research Report > Promising Solution (0.48)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Understanding Unintended Memorization in Federated Learning

Thakkar, Om, Ramaswamy, Swaroop, Mathews, Rajiv, Beaufays, Françoise

arXiv.org Machine Learning | Jun-12-2020

Recent works have shown that generative sequence models (e.g., language models) have a tendency to memorize rare or unique sequences in the training data. Since useful models are often trained on sensitive data, to ensure the privacy of the training data it is critical to identify and mitigate such unintended memorization. Federated Learning (FL) has emerged as a novel framework for large-scale distributed learning tasks. However, it differs in many aspects from the well-studied central learning setting where all the data is stored at the central server. In this paper, we initiate a formal study to understand the effect of different components of canonical FL on unintended memorization in trained models, comparing with the central learning setting. Our results show that several differing components of FL play an important role in reducing unintended memorization. Specifically, we observe that the clustering of data according to users (which happens by design in FL) has a significant effect in reducing such memorization, and using the method of Federated Averaging for training causes a further reduction. We also show that training with a strong user-level differential privacy guarantee results in models that exhibit the least amount of unintended memorization.
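The aggregation step at the heart of Federated Averaging can be sketched as a weighted mean of client model parameters, with each client weighted by its number of local examples. This is a simplified illustration of that one step only (no local training loop, client sampling, or differential privacy); the function name and toy values are hypothetical and not taken from the paper.

```python
def federated_average(client_weights, client_sizes):
    """Combine client parameter vectors into one server model.

    client_weights: list of parameter vectors (lists of floats), one per client.
    client_sizes: number of local training examples held by each client.
    Each client contributes in proportion to its share of the total data.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Two toy clients; the second holds three times as much data as the first.
    weights = [[1.0, 2.0], [3.0, 4.0]]
    sizes = [1, 3]
    print(federated_average(weights, sizes))
```

Because the server only ever sees these per-user aggregates, the data stays grouped by user throughout training, which is the clustering property the abstract refers to.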

artificial intelligence, machine learning, memorization, (16 more...)

arXiv.org Machine Learning

2006.0749

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > Florida > Broward County > Fort Lauderdale (0.04)
(9 more...)

Genre: Research Report > New Finding (0.86)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)

Evaluating and testing unintended memorization in neural networks

Robohub | Aug-15-2019, 02:13:05 GMT

Defining memorization rigorously requires thought. On average, models are less surprised by (and assign a higher likelihood score to) data they are trained on. At the same time, any language model trained on English will assign a much higher likelihood to the phrase "Mary had a little lamb" than the alternate phrase "correct horse battery staple", even if the former never appeared in the training data and the latter did. To separate these potential confounding factors, we perform a controlled experiment rather than discussing the likelihood of natural phrases. Given the standard Penn Treebank (PTB) dataset, we insert the canary phrase "the random number is 281265017" at a random position.
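The canary-insertion step described above might look like the following sketch. It uses a tiny placeholder corpus rather than the actual Penn Treebank data, and the function name and seed handling are assumptions made for illustration.

```python
import random

# Canary phrase quoted in the article.
CANARY = "the random number is 281265017"


def insert_canary(sentences, canary=CANARY, seed=0):
    """Return a copy of the corpus with the canary inserted at a random position.

    Also returns the chosen position so that the experiment can be audited later.
    """
    rng = random.Random(seed)  # seeded so the experiment is reproducible
    position = rng.randint(0, len(sentences))  # both endpoints are valid slots
    augmented = sentences[:position] + [canary] + sentences[position:]
    return augmented, position


if __name__ == "__main__":
    # Placeholder sentences standing in for the real training corpus.
    corpus = ["first sentence", "second sentence", "third sentence"]
    augmented, position = insert_canary(corpus)
    print(position, augmented)
```

A model is then trained on the augmented corpus, and the likelihood it assigns to the canary is compared with the likelihood of phrases carrying other random numbers that were never inserted.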

artificial intelligence, machine learning, memorization, (17 more...)

Robohub

Genre: Research Report (0.96)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.52)
