Spatio-Temporal Few-Shot Learning via Diffusive Neural Network Generation

Yuan, Yuan, Shao, Chenyang, Ding, Jingtao, Jin, Depeng, Li, Yong

Mar-25-2024–arXiv.org Artificial Intelligence

Spatio-temporal modeling is foundational for smart city applications, yet it is often hindered by data scarcity in many cities and regions. To bridge this gap, we propose a novel generative pre-training framework, GPD, for spatio-temporal few-shot learning with urban knowledge transfer. Unlike conventional approaches that rely heavily on common feature extraction or intricate few-shot learning designs, our solution performs generative pre-training on a collection of neural network parameters optimized with data from source cities. We recast spatio-temporal few-shot learning as pre-training a generative diffusion model, which generates tailored neural networks guided by prompts, allowing it to adapt to diverse data distributions and city-specific characteristics. GPD employs a Transformer-based denoising diffusion model, which is model-agnostic and can be integrated with powerful spatio-temporal neural networks. By addressing challenges arising from data gaps and the complexity of generalizing knowledge across cities, our framework consistently outperforms state-of-the-art baselines on multiple real-world datasets for tasks such as traffic speed prediction and crowd flow prediction.

Spatio-temporal prediction is a fundamental problem in various smart city applications (Xia et al., 2024; Zhou et al., 2024; Wang et al., 2023a;c;b). Many deep learning models have been proposed to solve this problem, but their success relies on large-scale spatio-temporal data. Due to imbalanced development levels and different data collection policies, urban spatio-temporal data, such as traffic and crowd flow data, are usually limited in many cities and regions. Under these circumstances, a model's transferability in data-scarce scenarios is of pressing importance. To address this issue, various transfer learning approaches have emerged for spatio-temporal modeling.
Their primary goal is to leverage knowledge and insights gained from one or multiple source cities and apply them effectively to a target city. These approaches can be broadly classified into two main categories. However, existing fine-grained methods largely rely on elaborate matching designs, such as utilizing auxiliary data for similarity calculation (Wang et al., 2019) or incorporating multi-task learning to obtain implicit representations (Lu et al., 2022). How to enable more general knowledge transfer that automatically retrieves similar characteristics across source and target cities remains unsolved. Recently, pre-trained models have yielded significant breakthroughs in Natural Language Processing (NLP) (Brown et al., 2020; Vaswani et al., 2017). Prompting techniques have also been introduced to reduce the gap between fine-tuning and pre-training (Brown et al., 2020).
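
The central idea of the abstract, treating optimized network parameters as the data that a prompt-guided diffusion model learns to generate, can be illustrated with a minimal sketch of how one training example for such a denoiser might be built. This is not the authors' implementation: the linear noise schedule, the noise-prediction target, and every name here (`linear_beta_schedule`, `noise_parameters`, `training_example`, `theta0`, `prompt`) are assumptions borrowed from standard denoising diffusion practice, and the Transformer denoiser itself is omitted.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out

def noise_parameters(theta0, t, alpha_bars, rng):
    """Sample theta_t from q(theta_t | theta_0); also return the noise used."""
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in theta0]
    theta_t = [math.sqrt(a) * w + math.sqrt(1.0 - a) * e
               for w, e in zip(theta0, eps)]
    return theta_t, eps

def training_example(theta0, prompt, alpha_bars, rng):
    """Build one (inputs, target) pair for a prompt-conditioned denoiser.

    The denoiser (not shown) would receive the noised parameters, the
    timestep, and the prompt, and would be trained to predict the noise.
    """
    t = rng.randrange(len(alpha_bars))
    theta_t, eps = noise_parameters(theta0, t, alpha_bars, rng)
    return {"theta_t": theta_t, "t": t, "prompt": prompt}, eps

# Hypothetical data: flattened weights of a small model trained on one
# source-city region, and a made-up prompt vector describing that region.
rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
theta0 = [0.5, -1.2, 0.3, 0.8]
prompt = [0.1, 0.9]
inputs, target = training_example(theta0, prompt, alpha_bars, rng)
```

Under these assumptions, few-shot transfer would amount to running the learned reverse process from pure noise while conditioning on a prompt for the target city, so the sampled vector can be reshaped into the weights of a prediction network for that city.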

