On the Burstiness of Distributed Machine Learning Traffic