Activations and Gradients Compression for Model-Parallel Training