Generative AI for Math: Part I -- MathPile: A Billion-Token-Scale Pretraining Corpus for Math