Exploring Forgetting in Large Language Model Pre-Training