Scalable and Consistent Graph Neural Networks for Distributed Mesh-based Data-driven Modeling

Shivam Barwey, Riccardo Balin, Bethany Lusch, Saumil Patel, Ramesh Balakrishnan, Pinaki Pal, Romit Maulik, Venkatram Vishwanath

Oct-2-2024–arXiv.org Artificial Intelligence

This work develops a distributed graph neural network (GNN) methodology for mesh-based modeling applications using a consistent neural message passing layer. As the name implies, the focus is on enabling scalable operations that satisfy physical consistency via halo nodes at sub-graph boundaries. Here, consistency refers to the fact that a GNN trained and evaluated on one rank (one large graph) is arithmetically equivalent to evaluations on multiple ranks (a partitioned graph). This concept is demonstrated by interfacing GNNs with NekRS, a GPU-capable exascale CFD solver developed at Argonne National Laboratory. It is shown how the NekRS mesh partitioning can be linked to the distributed GNN training and inference routines, resulting in a scalable mesh-based data-driven modeling workflow. We study the impact of consistency on the scalability of mesh-based GNNs, demonstrating efficient scaling in consistent GNNs for up to O(1B) graph nodes on the Frontier exascale supercomputer.
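The consistency property described in the abstract can be illustrated with a toy example. The sketch below is not the paper's message passing layer or the NekRS interface; it assumes a plain sum aggregation over a six-node chain graph split across two hypothetical ranks, and all function and variable names are illustrative. It checks that the partitioned result matches the single-graph result only when each rank holds halo copies of the off-rank neighbors of its boundary nodes.

```python
def sum_aggregate(x, edges, n_out):
    """Sum features of source nodes into destination nodes (directed edges)."""
    out = [0.0] * n_out
    for src, dst in edges:
        out[dst] += x[src]
    return out


def rank_aggregate(owned, edges, x, use_halo):
    """Aggregate on one rank, using only data that rank would hold locally.

    With use_halo=True the rank also stores copies of off-rank neighbors
    (halo nodes), which stands in for a halo exchange between ranks.
    """
    owned_set = set(owned)
    halo = []
    if use_halo:
        halo = sorted({s for s, d in edges if d in owned_set and s not in owned_set})
    nodes = list(owned) + halo                      # owned nodes first, halo after
    local = {g: i for i, g in enumerate(nodes)}     # global id -> local id
    local_x = [x[g] for g in nodes]
    local_edges = [(local[s], local[d]) for s, d in edges
                   if d in owned_set and s in local]
    return sum_aggregate(local_x, local_edges, len(owned))


# Toy chain graph 0-1-2-3-4-5, stored as directed edges in both directions.
n = 6
edges = [(i, i + 1) for i in range(n - 1)] + [(i + 1, i) for i in range(n - 1)]
x = [float(i + 1) for i in range(n)]

# Assumed two-rank partition; the edge between nodes 2 and 3 is cut.
parts = [[0, 1, 2], [3, 4, 5]]

full = sum_aggregate(x, edges, n)
with_halo = [v for p in parts for v in rank_aggregate(p, edges, x, True)]
without_halo = [v for p in parts for v in rank_aggregate(p, edges, x, False)]

matches_with_halo = with_halo == full
matches_without_halo = without_halo == full
print(matches_with_halo, matches_without_halo)
```

Without halo nodes, the two nodes adjacent to the cut edge each miss one neighbor contribution, so the partitioned output differs from the single-graph output at exactly those nodes.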

Country:
- North America > United States (1.00)

Genre:
- Research Report (0.64)

Industry:
- Energy (1.00)
- Government > Regional Government
  - North America Government > United States Government (0.46)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)