SGD with shuffling: optimal rates without component convexity and large epoch requirements
