74dbd1111727a31a2b825d615d80b2e7-Supplemental.pdf

Neural Information Processing Systems 

Recent empirical successes in large-scale machine learning have been powered by massive data parallelism and hardware acceleration, with batch sizes trending beyond 10K+ images [46] or 1M+ tokens [9]. Numerous interdisciplinary sources [5, 12, 24, 33] indicate that the performance bottlenecks of contemporary deep learning pipelines can lie in many places other than gradient computation.
