DiLoCo: Distributed Low-Communication Training of Language Models
