Increasing Both Batch Size and Learning Rate Accelerates Stochastic Gradient Descent
