Emergent Temporal Correspondences from Video Diffusion Transformers

Neural Information Processing Systems 

DiffTrack [...] tracking annotations and proposes novel evaluation metrics to systematically analyze how each component within the full 3D attention mechanism of DiTs (e.g., representations, layers, and timesteps) contributes to establishing temporal correspondences.
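To make the notion of a temporal correspondence read out of attention concrete, the following is a minimal sketch and not the paper's method or metrics. It assumes that query features for the tokens of one frame and key features for the tokens of another frame are available, and it takes each source token's match to be the target token with the highest scaled dot-product attention weight. The function name `match_tokens`, the array shapes, and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np

def match_tokens(q_src, k_tgt):
    """Match each source-frame token to its highest-attention target-frame token.

    q_src: (N, d) query features of the source frame (hypothetical input)
    k_tgt: (M, d) key features of the target frame (hypothetical input)
    Returns the (N,) array of matched target indices and the (N, M) attention map.
    """
    d = q_src.shape[-1]
    logits = q_src @ k_tgt.T / np.sqrt(d)         # scaled dot-product scores
    logits -= logits.max(axis=-1, keepdims=True)  # subtract row max for stability
    attn = np.exp(logits)
    attn /= attn.sum(axis=-1, keepdims=True)      # row-wise softmax
    return attn.argmax(axis=-1), attn

# Toy data (assumption): the target frame holds the same unit-norm token
# features as the source frame, only in shuffled order.
rng = np.random.default_rng(0)
feats = rng.normal(size=(6, 16))
feats /= np.linalg.norm(feats, axis=-1, keepdims=True)
perm = rng.permutation(6)

matches, attn = match_tokens(feats, feats[perm])
```

In this toy setup the features are unit-normalized, so each token's largest score is with its own shuffled copy and `matches` recovers the inverse of the shuffle. In a real video DiT, the queries and keys would come from a particular layer and denoising timestep, which is exactly the kind of choice the analysis described above is meant to compare.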
