Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs

May-26-2025, 19:43:41 GMT–Neural Information Processing Systems

To mitigate memorization, we introduce a subtle modification to the next-token training objective that we call the goldfish loss. During training, a randomly sampled subset of tokens is excluded from the loss computation. The model does not memorize these dropped tokens, which prevents verbatim reproduction of a complete chain of tokens from the training set.
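The masking idea described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names, the `drop_prob` parameter, and the toy logits are all hypothetical and chosen only for illustration. The sketch assumes the per-position logits come from an ordinary forward pass, so dropped tokens would still serve as context and only their loss terms are skipped.

```python
import math
import random


def log_softmax(logits):
    """Numerically stable log-softmax over one vocabulary-sized list."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def sample_goldfish_mask(num_tokens, drop_prob, rng):
    """Return 1 for tokens kept in the loss and 0 for tokens dropped from it.

    drop_prob is a hypothetical knob; the abstract only says the subset
    is randomly sampled.
    """
    return [0 if rng.random() < drop_prob else 1 for _ in range(num_tokens)]


def goldfish_loss(logits, targets, mask):
    """Mean next-token cross-entropy over the kept (mask == 1) positions only."""
    total, kept = 0.0, 0
    for position_logits, target, keep in zip(logits, targets, mask):
        if keep:
            total -= log_softmax(position_logits)[target]
            kept += 1
    # If every token was dropped, there is no training signal for this sequence.
    return total / kept if kept else 0.0


# Toy usage: 4 positions, vocabulary of 3 tokens (made-up numbers).
logits = [[2.0, 0.5, 0.1], [0.2, 1.5, 0.3], [0.0, 0.0, 3.0], [1.0, 1.0, 1.0]]
targets = [0, 1, 2, 1]
mask = sample_goldfish_mask(len(targets), drop_prob=0.25, rng=random.Random(0))
loss = goldfish_loss(logits, targets, mask)
```

With a mask of all ones, `goldfish_loss` reduces to the standard next-token cross-entropy, which makes the modification easy to compare against a baseline.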

large language model, machine learning, natural language, (5 more...)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (0.77)
  - Machine Learning > Memory-Based Learning
    - Rote Learning (0.73)