Review for NeurIPS paper: AutoSync: Learning to Synchronize for Data-Parallel Distributed Deep Learning
–Neural Information Processing Systems
Additional Feedback: Section 3.1, Equation (2): I believe p is missing in r {\Pi}_{i,k}. The example of saving 7 days and 2200 AWS credits should be given in the context of the full cost. In the subsection 'Search space evaluation', I don't understand how hit rates of 42% for VGG16 and 28.5% can be considered large. The way I understand it, this means that 58% and 71.5% of the strategies were worse than the hand-optimized baselines. Why is no hit rate reported in Figure 3? Table 3 would be more informative if expressed in terms of improvement measures.
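To make the arithmetic behind the hit-rate concern explicit, here is a minimal sketch. It assumes the hit rate is the percentage of sampled strategies that beat the hand-optimized baseline, so its complement is the percentage that did not (ties, if any, would fall into the complement). The function name and the label for the second figure are hypothetical; the review does not say which model the 28.5% refers to.

```python
def miss_rate(hit_rate_percent: float) -> float:
    """Percentage of strategies that did NOT beat the baseline.

    Assumes every sampled strategy is counted either as a hit or a miss.
    """
    return 100.0 - hit_rate_percent


# Hit rates quoted in the review; the second label is a placeholder
# because the review does not name the model for the 28.5% figure.
reported_hit_rates = {"VGG16": 42.0, "second reported figure": 28.5}

misses = {name: miss_rate(rate) for name, rate in reported_hit_rates.items()}

for name, miss in misses.items():
    print(f"{name}: {miss:.1f}% of strategies did not beat the baseline")
```

Under this reading, a majority of the sampled strategies fall short of the baseline in both cases, which is why the review asks for the hit rates to be put in context.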
Jan-21-2025, 09:47:32 GMT