Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs
–Neural Information Processing Systems
To mitigate memorization, we introduce a subtle modification to the next-token training objective that we call the goldfish loss. During training, a randomly sampled subset of tokens is excluded from the loss computation. These dropped tokens are never fit by the model, which prevents verbatim reproduction of a complete chain of tokens from the training set. We run extensive experiments training billion-scale LLaMA-2 models, both pre-trained and trained from scratch, and demonstrate significant reductions in extractable memorization with little to no impact on downstream benchmarks.
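The core idea can be sketched in a few lines: compute per-token cross-entropy as usual, then mask out a random subset of positions before averaging. This is a minimal, illustrative numpy version; the function name, the `drop_prob` value, and the i.i.d. Bernoulli mask are assumptions for the sketch (the paper also considers other masking schemes), not the authors' exact implementation.

```python
import numpy as np

def goldfish_loss(logits, targets, drop_prob=0.25, rng=None):
    """Next-token cross-entropy with a random subset of positions
    dropped from the loss, so those tokens are never fit and cannot
    be reproduced verbatim.

    logits: (T, V) array of unnormalized scores, targets: (T,) ints.
    drop_prob and the Bernoulli mask are illustrative choices.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    T, _ = logits.shape
    # Stable log-softmax over the vocabulary dimension.
    z = logits - logits.max(axis=1, keepdims=True)
    logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    nll = -logp[np.arange(T), targets]        # per-token loss
    keep = rng.random(T) >= drop_prob         # "goldfish" mask
    # Average only over the kept positions; dropped tokens get no gradient.
    return nll[keep].mean() if keep.any() else 0.0
```

In a real training loop the same masking would be applied to the per-token losses before the backward pass, so the dropped positions contribute no gradient signal at all.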