Reviews: Theoretical Limits of Pipeline Parallel Optimization and Application to Distributed Deep Learning
–Neural Information Processing Systems
The relationship between the proposed pipeline parallel optimization setting and existing work is not clear. Does the setting contain those studied in related work as special cases? The authors mention in the abstract that the presented study is distributed per layer rather than per sample. An additional comparison along this line would be helpful; it is only briefly touched on in Section 2, in the discussion of asynchronous value/gradient evaluation.
parallel optimization and application, pipeline parallel optimization, theoretical limit, (5 more...)
Feb-5-2025, 07:27:00 GMT