Improving training time and GPU utilization in geo-distributed language model training
