Adaptive Learning of the Optimal Mini-Batch Size of SGD