PGT-I: Scaling Spatiotemporal GNNs with Memory-Efficient Distributed Training
Ockerman, Seth, Gueroudji, Amal, Mallick, Tanwi, He, Yixuan, Pouchard, Line, Ross, Robert, Venkataraman, Shivaram
arXiv.org Artificial Intelligence
Spatiotemporal graph neural networks (ST-GNNs) are powerful tools for modeling spatial and temporal data dependencies. However, their applications have been limited primarily to small-scale datasets because of memory constraints. While distributed training offers a solution, current frameworks lack support for spatiotemporal models and overlook the properties of spatiotemporal data. Informed by a scaling study on a large-scale workload, we present PyTorch Geometric Temporal Index (PGT-I), an extension to PyTorch Geometric Temporal that integrates distributed data parallel training and two novel strategies: index-batching and distributed-index-batching. Our index techniques exploit spatiotemporal structure to construct snapshots dynamically at runtime, significantly reducing memory overhead, while distributed-index-batching extends this approach by enabling scalable processing across multiple GPUs. Our techniques enable the first-ever training of an ST-GNN on the entire PeMS dataset without graph partitioning, reducing peak memory usage by up to 89% and achieving up to an 11.78x speedup over standard DDP with 128 GPUs.
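The idea of constructing snapshots at runtime from indices can be illustrated with a minimal sketch. This is not PGT-I's API: the class `IndexBatchedWindows`, the function `shard_indices`, and the parameters `window` and `horizon` are hypothetical names chosen for illustration, and the round-robin split across ranks is an assumed stand-in for whatever sharding the distributed variant actually uses. The sketch keeps a single copy of the time series and stores only window start indices, instead of materializing every overlapping sliding window up front.

```python
import numpy as np


class IndexBatchedWindows:
    """Hypothetical helper: one copy of the series, windows sliced on demand."""

    def __init__(self, series, window, horizon):
        # series has shape (timesteps, nodes, features) and is stored once.
        self.series = series
        self.window = window
        self.horizon = horizon
        # Only the start index of each sample is stored, not the sample itself.
        self.starts = np.arange(series.shape[0] - window - horizon + 1)

    def __len__(self):
        return len(self.starts)

    def batch(self, idx):
        """Build input/target snapshots for the given sample indices at runtime."""
        starts = self.starts[np.asarray(idx)]
        x = np.stack([self.series[s:s + self.window] for s in starts])
        y = np.stack([
            self.series[s + self.window:s + self.window + self.horizon]
            for s in starts
        ])
        return x, y


def shard_indices(num_samples, rank, world_size):
    """Assumed round-robin split of sample indices across workers."""
    return np.arange(rank, num_samples, world_size)


# Toy data: 20 timesteps, 3 nodes, 1 feature.
series = np.arange(20 * 3, dtype=np.float32).reshape(20, 3, 1)
dataset = IndexBatchedWindows(series, window=4, horizon=2)
x, y = dataset.batch(shard_indices(len(dataset), rank=1, world_size=4))
```

With eager preprocessing, each timestep would be copied into roughly `window + horizon` overlapping samples; here only the batch currently being built is materialized.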
Sep-17-2025
- Country:
  - Asia
    - China (0.04)
    - Macao (0.04)
    - Middle East > Jordan (0.04)
  - Europe
  - North America
    - Canada > British Columbia
    - United States
      - Arizona > Maricopa County
        - Phoenix (0.04)
      - California > San Diego County
        - San Diego (0.04)
      - Colorado > Denver County
        - Denver (0.04)
      - District of Columbia > Washington (0.04)
      - Illinois > Cook County
        - Lemont (0.04)
      - New Mexico > Bernalillo County
        - Albuquerque (0.04)
      - New York > New York County
        - New York City (0.04)
      - Wisconsin > Dane County
        - Madison (0.14)
- Genre:
  - Research Report > Promising Solution (0.93)
- Industry:
  - Energy > Renewable (0.67)
  - Government > Regional Government
  - Health & Medicine (0.94)
- Technology: