Diffusion Transformers as Open-World Spatiotemporal Foundation Models
Neural Information Processing Systems
The urban environment is characterized by complex spatiotemporal dynamics arising from diverse human activities and interactions. Effectively modeling these dynamics is essential for understanding and optimizing urban systems. In this work, we introduce UrbanDiT, a foundation model for open-world urban spatiotemporal learning that successfully scales up diffusion transformers in this field.
Jun-15-2026, 22:04:00 GMT
- Genre:
- Overview (0.93)
- Research Report
- Experimental Study (1.00)
- New Finding (0.67)
- Industry:
- Transportation (0.67)
- Information Technology (0.46)
- Technology:
- Information Technology
- Data Science > Data Mining (1.00)
- Communications (0.93)
- Artificial Intelligence
- Vision (1.00)
- Representation & Reasoning (1.00)
- Natural Language > Large Language Model (1.00)
- Machine Learning
- Neural Networks > Deep Learning (1.00)
- Statistical Learning (0.93)