Bayesian Perspective on Memorization and Reconstruction

Kaplan, Haim, Mansour, Yishay, Nissim, Kobbi, Stemmer, Uri

May-30-2025–arXiv.org Artificial Intelligence

Carlini et al. [2019] showed that it is sometimes possible to extract unique pieces of training data, such as credit card numbers, from modern language models. This demonstrates that such models can unintentionally memorize rare parts of their training data, even if those parts appear only once. Since then, this memorization phenomenon has been studied in a long line of work, providing increasingly many examples in which modern models unintentionally memorize data. In fact, several follow-up papers have shown that there exist learning tasks for which memorization is provably necessary [Feldman, 2020; Feldman and Zhang, 2020; Carlini et al., 2021; Brown et al., 2021; Haim et al., 2022; Buzaglo et al., 2023; Carlini et al., 2023a,b]. However, these prior works did not converge on a single definition of memorization, and instead considered several context-dependent notions.
