
Collaborating Authors

 Jinkun Geng



BML: A High-performance, Low-cost Gradient Synchronization Algorithm for DML Training

Neural Information Processing Systems

In distributed machine learning (DML), the network performance between machines significantly impacts the speed of iterative training. In this paper, we propose BML, a new gradient synchronization algorithm with higher network performance and lower network cost than current practice. BML runs on a BCube network instead of the traditional Fat-Tree topology.
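To make the synchronization step concrete, below is a minimal sketch of data-parallel gradient synchronization by averaging. It is a generic illustration of the step whose network cost the paper targets, not BML's BCube-based algorithm; the worker count, model size, and the `local_gradient` helper are illustrative assumptions.

```python
# Generic gradient-synchronization sketch (NOT BML's BCube algorithm).
import numpy as np

NUM_WORKERS = 4   # assumed number of workers (hypothetical)
MODEL_SIZE = 8    # assumed number of model parameters (hypothetical)


def local_gradient(worker_id: int, params: np.ndarray) -> np.ndarray:
    """Stand-in for a worker's locally computed gradient on its data shard."""
    rng = np.random.default_rng(worker_id)
    return rng.normal(size=params.shape)


def synchronize(gradients: list) -> np.ndarray:
    """Average gradients across workers -- the communication-heavy step
    whose network performance BML aims to improve."""
    return np.mean(gradients, axis=0)


params = np.zeros(MODEL_SIZE)
lr = 0.1
for step in range(3):
    grads = [local_gradient(w, params) for w in range(NUM_WORKERS)]
    avg_grad = synchronize(grads)   # every worker applies the same update
    params -= lr * avg_grad
    print(f"step {step}: |params| = {np.linalg.norm(params):.4f}")
```

In a real cluster, `synchronize` would be realized by collective communication over the physical topology (e.g., Fat-Tree or, in BML's case, BCube), which is where the algorithms differ in traffic pattern and cost.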