Emergent Temporal Correspondences from Video Diffusion Transformers
Neural Information Processing Systems
DiffTrack ... tracking annotations and proposes novel evaluation metrics to systematically analyze how each component within the full 3D attention mechanism of DiTs (e.g., representations, layers, and timesteps) contributes to establishing temporal correspondences.
Jun-22-2026, 01:12:40 GMT
- Country:
- Europe (0.28)
- Genre:
- Research Report > Experimental Study (1.00)
- Industry:
- Media (0.46)
- Technology:
- Information Technology
- Communications (0.67)
- Artificial Intelligence
- Vision (1.00)
- Machine Learning > Neural Networks (1.00)
- Natural Language > Large Language Model (0.69)
- Representation & Reasoning (0.67)