On the Computational Inefficiency of Large Batch Sizes for Stochastic Gradient Descent