Spatio-Temporal Few-Shot Learning via Diffusive Neural Network Generation
Yuan, Yuan, Shao, Chenyang, Ding, Jingtao, Jin, Depeng, Li, Yong
arXiv.org Artificial Intelligence
Spatio-temporal modeling is foundational for smart city applications, yet it is often hindered by data scarcity in many cities and regions. To bridge this gap, we propose a novel generative pre-training framework, GPD, for spatio-temporal few-shot learning with urban knowledge transfer. Unlike conventional approaches that rely heavily on common feature extraction or intricate few-shot learning designs, our solution performs generative pre-training on a collection of neural network parameters optimized with data from source cities. We recast spatio-temporal few-shot learning as pre-training a generative diffusion model, which generates tailored neural networks guided by prompts, allowing for adaptability to diverse data distributions and city-specific characteristics. GPD employs a Transformer-based denoising diffusion model, which is model-agnostic and can be integrated with powerful spatio-temporal neural networks. By addressing challenges arising from data gaps and the complexity of generalizing knowledge across cities, our framework consistently outperforms state-of-the-art baselines on multiple real-world datasets for tasks such as traffic speed prediction and crowd flow prediction.

Spatio-temporal prediction is a fundamental problem in various smart city applications (Xia et al., 2024; Zhou et al., 2024; Wang et al., 2023a;c;b). Many deep learning models have been proposed to solve this problem, but their success relies on large-scale spatio-temporal data. Due to imbalanced development levels and different data collection policies, urban spatio-temporal data, such as traffic and crowd flow data, are usually limited in many cities and regions. Under these circumstances, the model's transferability in data-scarce scenarios is of pressing importance. To address this issue, various transfer learning approaches have emerged for spatio-temporal modeling.
Their primary goal is to leverage knowledge and insights gained from one or multiple source cities and apply them effectively to a target city. These approaches can be broadly classified into two main categories. However, existing fine-grained methods largely rely on elaborate matching designs, such as utilizing auxiliary data for similarity calculation (Wang et al., 2019) or incorporating multi-task learning to obtain implicit representations (Lu et al., 2022). How to enable more general knowledge transfer that automatically retrieves similar characteristics across source and target cities remains unsolved. Recently, pre-trained models have yielded significant breakthroughs in the field of Natural Language Processing (NLP) (Brown et al., 2020; Vaswani et al., 2017). Prompting techniques have also been introduced to reduce the gap between fine-tuning and pre-training (Brown et al., 2020).
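The idea stated in the abstract, training a prompt-conditioned denoising diffusion model over neural network parameters, can be illustrated with a minimal sketch. It assumes the standard DDPM forward process and noise-prediction loss, which the text above does not spell out. The linear noise schedule, the function names, and the `predict_noise` callable (a stand-in for the Transformer-based denoiser) are all hypothetical choices for illustration, not the authors' implementation.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (an assumed, commonly used schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t: running product of (1 - beta_s) up to and including step t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def noise_parameters(w0, t, alpha_bars, rng):
    """Draw w_t = sqrt(alpha_bar_t) * w0 + sqrt(1 - alpha_bar_t) * eps.

    w0 is a flattened parameter vector of a model trained on a source city.
    Returns the noised vector and the Gaussian noise that was added.
    """
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in w0]
    w_t = [math.sqrt(a) * w + math.sqrt(1.0 - a) * e for w, e in zip(w0, eps)]
    return w_t, eps


def denoising_loss(predict_noise, w0, prompt, t, alpha_bars, rng):
    """Mean squared error between predicted and true noise at step t.

    predict_noise(w_t, t, prompt) is a hypothetical denoiser interface;
    the prompt carries whatever conditioning describes the target region.
    """
    w_t, eps = noise_parameters(w0, t, alpha_bars, rng)
    eps_hat = predict_noise(w_t, t, prompt)
    return sum((p - e) ** 2 for p, e in zip(eps_hat, eps)) / len(eps)


# Usage with toy values: a 4-parameter "network" and a 2-dimensional prompt.
rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
source_params = [0.5, -1.2, 0.3, 0.0]   # placeholder for optimized weights
region_prompt = [1.0, 0.0]              # placeholder conditioning vector


def zero_denoiser(w_t, t, prompt):
    """Untrained placeholder that always predicts zero noise."""
    return [0.0] * len(w_t)


loss = denoising_loss(zero_denoiser, source_params, region_prompt,
                      500, alpha_bars, rng)
```

In a real setup the placeholder denoiser would be replaced by a trainable network, and the loss would be averaged over many source-city parameter vectors and randomly sampled steps.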
Mar-25-2024
- Country:
  - Asia > China
    - Beijing > Beijing (0.04)
    - Guangdong Province > Shenzhen (0.04)
    - Heilongjiang Province > Daqing (0.04)
    - Sichuan Province > Chengdu (0.05)
  - North America
    - Trinidad and Tobago > Trinidad
    - United States
      - California > Los Angeles County (0.04)
      - District of Columbia > Washington (0.05)
      - Maryland > Baltimore County (0.04)
      - New York (0.04)
      - Virginia > Arlington County (0.04)
- Genre:
  - Research Report > Promising Solution (0.48)
- Industry:
  - Transportation (0.46)
- Technology: