Diffusion Transformers as Open-World Spatiotemporal Foundation Models
Neural Information Processing Systems
The urban environment is characterized by complex spatio-temporal dynamics arising from diverse human activities and interactions. Effectively modeling these dynamics is essential for understanding and optimizing urban systems. In this work, we introduce UrbanDiT, a foundation model for open-world urban spatio-temporal learning that successfully scales up diffusion transformers in this field.
Jun-11-2026, 09:22:10 GMT