Fine-tuning Language Models over Slow Networks using Activation Quantization with Guarantees