The Fine-Grained Complexity of Gradient Computation for Training Large Language Models
