Finding NeMo: Localizing Neurons Responsible For Memorization in Diffusion Models
–Neural Information Processing Systems
Diffusion models (DMs) produce very detailed and high-quality images, but they can also memorize individual training samples and reproduce them at inference time. Prior efforts mitigate this issue by either changing the input to the diffusion process, thereby preventing the DM from generating memorized samples during inference, or removing the memorized data from training altogether. While those are viable solutions when the DM is developed and deployed in a secure and constantly monitored environment, they carry the risk of adversaries circumventing the safeguards and are not effective when the DM itself is publicly released. To address this problem, we introduce NeMo, the first method to localize memorization of individual data samples down to the level of neurons in DMs' cross-attention layers. Through our experiments, we make the intriguing finding that in many cases, single neurons are responsible for memorizing particular training samples.
Mar-17-2025, 11:19:16 GMT