Generalization and Optimization of SGD with Lookahead