Federated Language Models Under Bandwidth Budgets: Distillation Rates and Conformal Coverage