Escaping Collapse: The Strength of Weak Data for Large Language Model Training