Bayesian Perspective on Memorization and Reconstruction

Kaplan, Haim, Mansour, Yishay, Nissim, Kobbi, Stemmer, Uri

arXiv.org Artificial Intelligence 

Carlini et al. [2019] showed that it is sometimes possible to extract unique pieces of training data, such as credit card numbers, from modern language models. This demonstrates that such models can unintentionally memorize rare parts of their training data, even parts that appear only once. Since then, this memorization phenomenon has been studied in a long line of work that provides a growing number of examples in which modern models unintentionally memorize data. In fact, several follow-up papers have shown that there exist learning tasks for which memorization is provably necessary [Feldman, 2020; Feldman and Zhang, 2020; Carlini et al., 2021; Brown et al., 2021; Haim et al., 2022; Buzaglo et al., 2023; Carlini et al., 2023a,b]. However, these prior works did not converge on a single definition of memorization and instead considered several context-dependent notions.
