On the equivalence of different adaptive batch size selection strategies for stochastic gradient descent methods
