A Closer Look at the Generalization Gap in Large Batch Training of Neural Networks