B-ary Tree Push-Pull Method is Provably Efficient for Distributed Learning on Heterogeneous Data

Neural Information Processing Systems

Assumption 1.1 ensures that the gradient estimator