Weak-to-Strong Generalization Through the Data-Centric Lens