Stochastic gradient descent introduces an effective landscape-dependent regularization favoring flat solutions