Concept-TRAK: Understanding how diffusion models learn concepts through concept-level attribution

Park, Yonghyun; Lai, Chieh-Hsin; Hayakawa, Satoshi; Takida, Yuhta; Murata, Naoki; Liao, Wei-Hsiang; Choi, Woosung; Cheuk, Kin Wai; Koo, Junghyun; Mitsufuji, Yuki

Jul-28-2025, arXiv.org Artificial Intelligence

While diffusion models excel at image generation, their growing adoption raises critical concerns around copyright issues and model transparency. Existing attribution methods identify training examples influencing an entire image, but fall short in isolating contributions to specific elements, such as styles or objects, that matter most to stakeholders. To bridge this gap, we introduce concept-level attribution via a novel method called Concept-TRAK. Concept-TRAK extends influence functions with two key innovations: (1) a reformulated diffusion training loss based on diffusion posterior sampling, enabling robust, sample-specific attribution; and (2) a concept-aware reward function that emphasizes semantic relevance. We evaluate Concept-TRAK on the AbC benchmark, showing substantial improvements over prior methods. Through diverse case studies, ranging from identifying IP-protected and unsafe content to analyzing prompt engineering and compositional learning, we demonstrate how concept-level attribution yields actionable insights for responsible generative AI development and governance.
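The abstract says the method extends influence functions, so a toy illustration of a generic influence-style attribution score may help fix ideas. The sketch below is not Concept-TRAK and makes no claim about the authors' implementation: it uses a linear least-squares model in place of a diffusion model, a damped Gauss-Newton matrix in place of the true Hessian, and an arbitrary query gradient as a stand-in for the gradient of a concept-specific objective. All function names (`per_example_grads`, `attribution_scores`) and the `damping` parameter are hypothetical.

```python
import numpy as np


def per_example_grads(theta, X, y):
    """Gradient of 0.5 * (x @ theta - y)**2 w.r.t. theta, one row per example."""
    residual = X @ theta - y
    return residual[:, None] * X


def attribution_scores(train_grads, query_grad, damping=1e-3):
    """Influence-style score g_i^T H^{-1} g_query for every training example i.

    H is approximated by the damped Gauss-Newton matrix built from the
    training gradients (an assumption of this sketch). Sign conventions
    differ between papers; here a larger score means closer alignment.
    """
    n, d = train_grads.shape
    H = train_grads.T @ train_grads / n + damping * np.eye(d)
    return train_grads @ np.linalg.solve(H, query_grad)


# Toy data: 50 training points for a 4-parameter linear model.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
true_theta = np.array([1.0, -2.0, 0.5, 0.0])
y = X @ true_theta + 0.1 * rng.normal(size=50)
theta = np.linalg.lstsq(X, y, rcond=None)[0]

train_grads = per_example_grads(theta, X, y)

# Hypothetical query gradient: in a concept-level setting this would come
# from a concept-specific objective rather than a whole-sample loss.
query_grad = per_example_grads(theta, X[:1], y[:1] + 1.0)[0]

scores = attribution_scores(train_grads, query_grad)
top_five = np.argsort(-scores)[:5]  # indices of the highest-scoring examples
print(top_five)
```

The only design choice worth noting is that everything specific to the query enters through `query_grad`; swapping in a different measurement function changes the ranking without touching the training-side gradients.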

artificial intelligence, attribution, machine learning

Country:
- Europe > Austria (0.28)

Genre:
- Research Report > Promising Solution (0.34)

Industry:
- Law (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Uncertainty
    - Bayesian Inference (0.46)
  - Machine Learning
    - Neural Networks (1.00)
    - Learning Graphical Models > Directed Networks
      - Bayesian Learning (0.46)
