TIDE: Temporal-Aware Sparse Autoencoders for Interpretable Diffusion Transformers in Image Generation

Huang, Victor Shea-Jay, Zhuo, Le, Xin, Yi, Wang, Zhaokai, Gao, Peng, Li, Hongsheng

Mar-10-2025, arXiv.org Artificial Intelligence

Diffusion Transformers (DiTs) are a powerful class of generative models, yet they remain underexplored compared to U-Net-based diffusion models. To bridge this gap, we introduce TIDE (Temporal-aware Sparse Autoencoders for Interpretable Diffusion transformErs), a novel framework that enhances temporal reconstruction within DiT activation layers across denoising steps. TIDE employs Sparse Autoencoders (SAEs) with a sparse bottleneck layer to extract interpretable and hierarchical features, revealing that diffusion models inherently learn hierarchical features at multiple levels (e.g., 3D, semantic, class) during generative pre-training. Our approach achieves state-of-the-art reconstruction performance, with a mean squared error (MSE) of 1e-3 and a cosine similarity of 0.97 between original and reconstructed activations, demonstrating superior accuracy in capturing activation dynamics along the denoising trajectory. Beyond interpretability, we showcase TIDE's potential in downstream applications such as sparse activation-guided image editing and style transfer, enabling improved controllability for generative systems. By providing a comprehensive training and evaluation protocol tailored for DiTs, TIDE contributes to developing more interpretable, transparent, and trustworthy generative models.
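
To make the sparse bottleneck and the two reported metrics concrete, the following is a minimal sketch in plain Python, not the authors' implementation. The abstract does not say how sparsity is enforced or how the temporal component works, so the ReLU plus top-k bottleneck, the random weights, and all names here (`top_k_sparsify`, `sae_forward`, `k`) are illustrative assumptions. It only shows how an activation vector could pass through a sparse code and how MSE and cosine similarity would be computed against the reconstruction.

```python
import math
import random


def top_k_sparsify(values, k):
    """Keep the k largest entries and zero out the rest (assumed sparsity rule)."""
    if k >= len(values):
        return list(values)
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(values)]


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sae_forward(x, w_enc, w_dec, k):
    """Encode x, apply ReLU and a top-k bottleneck, then decode."""
    pre = [max(0.0, h) for h in matvec(w_enc, x)]
    z = top_k_sparsify(pre, k)      # sparse latent code
    x_hat = matvec(w_dec, z)        # reconstructed activation
    return z, x_hat


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def cosine_similarity(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return dot / norm if norm > 0 else 0.0


# Hypothetical sizes: an 8-dim activation and a 32-dim latent with 4 active units.
rng = random.Random(0)
d_model, d_latent, k = 8, 32, 4
w_enc = [[rng.gauss(0, 1) for _ in range(d_model)] for _ in range(d_latent)]
w_dec = [[rng.gauss(0, 1) for _ in range(d_latent)] for _ in range(d_model)]
x = [rng.gauss(0, 1) for _ in range(d_model)]

z, x_hat = sae_forward(x, w_enc, w_dec, k)
print("active latents:", sum(1 for v in z if v != 0.0))
print("MSE:", mse(x, x_hat))
print("cosine similarity:", cosine_similarity(x, x_hat))
```

With untrained random weights the reconstruction is poor; the figures quoted in the abstract refer to trained SAEs evaluated on DiT activations.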

Country:
- Europe > Germany
  - Bavaria > Upper Bavaria > Munich (0.04)
- Asia > China
  - Shanghai > Shanghai (0.04)

Genre:
- Research Report > New Finding (1.00)

Industry:
- Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.93)
