On Inductive Biases That Enable Generalization of Diffusion Transformers
Neural Information Processing Systems
Recent work studying the generalization of diffusion models with locally linear UNet-based denoisers reveals inductive biases that can be expressed via geometry-adaptive harmonic bases. For such locally linear UNets, these geometry-adaptive harmonic bases can be conveniently visualized through the eigen-decomposition of a UNet's Jacobian matrix. In practice, however, more recent denoising networks are often transformer-based, e.g., the diffusion transformer (DiT). Due to the presence of nonlinear operations, similar eigen-decomposition analyses cannot be used to reveal the inductive biases of transformer-based denoisers. This motivates our search for alternative ways to explain the strong generalization ability observed in DiT models.
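To make the Jacobian-based analysis concrete, the following is a minimal sketch on a toy model, not the UNet or DiT discussed above. It assumes a bias-free ReLU network with tied weights, f(x) = W1^T relu(W1 x), chosen only because it is locally linear (f(x) = J(x) x) and has a symmetric Jacobian. The weights `W1`, the input `x`, and all function names are hypothetical, and power iteration stands in for a full eigen-decomposition.

```python
import math

# Hypothetical toy weights (hidden size 4, input size 3); not from the paper.
W1 = [
    [1.0, 0.5, 0.0],
    [0.0, 1.0, -0.5],
    [0.5, 0.0, 1.0],
    [-1.0, 0.5, 0.5],
]

def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def denoiser(x):
    """Assumed toy denoiser: bias-free ReLU net with tied weights."""
    hidden = [max(0.0, h) for h in matvec(W1, x)]
    return matvec(transpose(W1), hidden)

def jacobian(x):
    """Jacobian at x: J = W1^T diag(mask) W1, where mask marks active ReLUs."""
    mask = [1.0 if h > 0 else 0.0 for h in matvec(W1, x)]
    n = len(x)
    return [
        [sum(mask[k] * W1[k][i] * W1[k][j] for k in range(len(W1)))
         for j in range(n)]
        for i in range(n)
    ]

def top_eigenpair(J, iters=200):
    """Power iteration for the leading eigenpair of a symmetric PSD matrix."""
    n = len(J)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = matvec(J, v)
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    lam = sum(a * b for a, b in zip(v, matvec(J, v)))  # Rayleigh quotient
    return lam, v

x = [1.0, 2.0, -1.0]
J = jacobian(x)
lam, v = top_eigenpair(J)

print("f(x)          =", denoiser(x))
print("J(x) x        =", matvec(J, x))  # matches f(x): local linearity
print("top eigenvalue =", lam)
print("top eigenvector =", v)
```

Because there are no bias terms, the network output equals its Jacobian applied to the input, so the eigenvectors of J(x) describe the basis in which the toy denoiser acts on x. Once softmax attention and normalization layers enter, as in a transformer, this identity no longer holds, which is the obstacle the abstract refers to.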
Jun-15-2026, 23:31:45 GMT