Erasing Undesirable Concepts in Diffusion Models with Adversarial Preservation
–Neural Information Processing Systems
Diffusion models excel at generating visually striking content from text but can inadvertently produce undesirable or harmful content when trained on unfiltered internet data. A practical solution is to selectively remove target concepts from the model, but doing so may impact the remaining concepts. Prior approaches have tried to balance this trade-off by introducing a loss term to preserve neutral content or a regularization term to minimize changes in the model parameters, yet resolving it remains challenging.