Quantized Distributed Training of Large Models with Convergence Guarantees