From Linear to Nonlinear: Provable Weak-to-Strong Generalization through Feature Learning