Improved Quantization Strategies for Managing Heavy-tailed Gradients in Distributed Learning