Scaling with Collapse: Efficient and Predictable Training of LLM Families
