How to derive ring all-reduce's mathematical property step by step
In our previous blog, Combating Software System Complexity: Appropriate Abstraction Layer, we mentioned that communication in a distributed deep learning framework relies heavily on regular collective communication operations such as all-reduce, reduce-scatter, and all-gather. It is therefore crucial to implement highly optimized collective communication and to select an ideal algorithm based on task requirements and communication topology. This article unveils the mathematical properties of collective communication operations by analyzing all-reduce, which is common in data parallelism. As illustrated in Figure 1, there are four devices, each holding one matrix (to keep things simple, each row in these matrices has only one element). All-reduce sums the values in the same row across devices and returns the result to that row on every device.
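To make the semantics concrete, here is a minimal sketch in plain Python of what all-reduce computes in the Figure 1 setup (it models only the result, not the ring algorithm; the device count and input values are illustrative assumptions, not taken from the figure):

```python
def all_reduce_sum(tensors):
    """Return, for every device, the element-wise sum over all devices."""
    # Sum the i-th element (row) across all devices' tensors.
    total = [sum(vals) for vals in zip(*tensors)]
    # Every device receives its own copy of the full result.
    return [list(total) for _ in tensors]

# Four devices, each with a 4-element vector (one element per "row").
inputs = [[i, i + 1, i + 2, i + 3] for i in range(4)]
outputs = all_reduce_sum(inputs)
print(outputs[0])  # [6, 10, 14, 18], identical on every device
```

After the operation, each device holds the same summed tensor, which is exactly the property exploited by data-parallel gradient synchronization.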
Jun-16-2022, 13:47:00 GMT