EDiT: A Local-SGD-Based Efficient Distributed Training Method for Large Language Models
