SAPipe: Staleness-Aware Pipeline for Data Parallel DNN Training
Neural Information Processing Systems
Data parallelism across multiple machines is widely adopted for accelerating distributed deep learning, but linear speedup is hard to achieve because of heavy communication overhead. In this paper, we propose SAPipe, a performant system that pushes the training speed of data parallelism to its fullest extent.
Dec-24-2025, 11:01:23 GMT