Fast Distributed Training of Deep Neural Networks: Dynamic Communication Thresholding for Model and Data Parallelism