Stochastic Gradient Descent with Large Learning Rate