Distributed Training in Deep Learning using PyTorch: A Handy Tutorial
PyTorch ships with built-in packages that support distributed training. There are two main approaches: DataParallel (DP) and DistributedDataParallel (DDP). DDP generally trains models faster than DP; however, it requires more changes to single-GPU code, namely to the model setup, the optimizer, and the backpropagation step. In our experience, the good news is that DDP can save a significant amount of training time by keeping all GPUs across multiple nodes busy at close to full memory utilization. In the following paragraphs, we elaborate on how to use DP and DDP by providing an example of each method.
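The difference between the two approaches can be sketched as follows. This is a minimal illustration, not the tutorial's own example: it runs in a single CPU process using the `gloo` backend with `world_size=1`, whereas real DDP training launches one process per GPU (e.g. via `torchrun`) with the NCCL backend; the model, port, and batch shapes here are arbitrary choices for illustration.

```python
import os
import torch
import torch.nn as nn
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

model = nn.Linear(10, 2)  # toy model for illustration

# DP: a one-line change; single process, multi-threaded across GPUs.
dp_model = nn.DataParallel(model)

# DDP: requires an initialized process group first (one process per GPU
# in real training; here a single CPU process so the sketch is runnable).
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")  # arbitrary free port
dist.init_process_group("gloo", rank=0, world_size=1)
ddp_model = DDP(model)

x = torch.randn(4, 10)
print(dp_model(x).shape)   # torch.Size([4, 2])
print(ddp_model(x).shape)  # torch.Size([4, 2])

dist.destroy_process_group()
```

Note that both wrappers leave the forward pass unchanged; the extra code DDP needs (process-group setup, per-process launch, and a `DistributedSampler` for the data loader) is what the paragraph above refers to as changes beyond the single-GPU script.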