Understanding the Implicit Regularization of Gradient Descent in Over-parameterized Models