When Less is More: Investigating Data Pruning for Pretraining LLMs at Scale
