Near-Optimal Sparse Allreduce for Distributed Deep Learning