GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism
Yanping Huang, Youlong Cheng, Ankur Bapna, Orhan Firat, Dehao Chen, Mia Chen, HyoukJoong Lee, Jiquan Ngiam, Quoc V. Le, Yonghui Wu, Zhifeng Chen
Neural Information Processing Systems
In many cases, increasing model capacity beyond the memory limit of a single accelerator has required developing special algorithms or infrastructure. These solutions are often architecture-specific and do not transfer to other tasks.