Automating model parallelism with just one line of code
Researchers from Google, Amazon Web Services, UC Berkeley, Shanghai Jiao Tong University, Duke University and Carnegie Mellon University have published a paper titled "Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning" at OSDI 2022. The paper introduces a method that automates the complex process of parallelising a model with only one line of code. So how does Alpa work? First, some background: data parallelism is a technique in which the model weights are replicated on every accelerator and only the training data is partitioned and distributed. The dataset is split into N parts, where N is the number of GPUs.
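To make the data-parallel idea concrete, here is a minimal sketch in plain Python. It is not Alpa's API or implementation: the function names, the one-parameter linear model and the learning rate are all assumptions chosen for illustration. Each simulated device holds the same weight, computes a gradient on its own shard of the data, and the gradients are then averaged before the shared weight is updated.

```python
from typing import List, Sequence, Tuple

Example = Tuple[float, float]  # one (x, y) training pair


def split_into_shards(dataset: Sequence[Example], num_devices: int) -> List[List[Example]]:
    """Split the dataset into num_devices contiguous, near-equal parts."""
    if num_devices <= 0:
        raise ValueError("num_devices must be positive")
    base, extra = divmod(len(dataset), num_devices)
    shards: List[List[Example]] = []
    start = 0
    for i in range(num_devices):
        size = base + (1 if i < extra else 0)  # spread any remainder over the first shards
        shards.append(list(dataset[start:start + size]))
        start += size
    return shards


def local_gradient(weight: float, shard: Sequence[Example]) -> float:
    """Gradient of the mean squared error of y = weight * x on one shard."""
    return sum(2.0 * (weight * x - y) * x for x, y in shard) / len(shard)


def data_parallel_step(weight: float, dataset: Sequence[Example],
                       num_devices: int, lr: float = 0.01) -> float:
    """One training step: same weight everywhere, different data on each device."""
    shards = [s for s in split_into_shards(dataset, num_devices) if s]
    # Every simulated device uses an identical copy of the weight.
    grads = [local_gradient(weight, shard) for shard in shards]
    # Stand-in for an all-reduce: average the gradients, weighted by shard size.
    total = sum(len(shard) for shard in shards)
    avg_grad = sum(g * len(s) for g, s in zip(grads, shards)) / total
    return weight - lr * avg_grad


if __name__ == "__main__":
    data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
    print(split_into_shards(data, 2))
    print(data_parallel_step(0.0, data, num_devices=2))
```

Because the shard gradients are averaged with weights proportional to shard size, a step over N shards yields the same update as a single-device step over the full dataset, up to floating-point rounding. That equivalence is what lets data parallelism scale training without changing the result.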
May-25-2022, 01:24:53 GMT