When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers