SlowMo: Improving Communication-Efficient Distributed SGD with Slow Momentum