On the Diversity of Synthetic Data and its Impact on Training Large Language Models
