A Unified Analysis of Stochastic Gradient Descent with Arbitrary Data Permutations and Beyond

Neural Information Processing Systems 

We aim to provide a unified convergence analysis for permutation-based Stochastic Gradient Descent (SGD), where data examples are permuted before each epoch.
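To make the setting concrete, the following is a minimal sketch of permutation-based SGD on a toy one-dimensional least-squares problem. It is an illustration only, not the paper's algorithm or analysis: the function names `permutation_sgd` and `random_reshuffle`, the step size, the epoch count, and the data are all assumptions chosen for the example. The ordering rule is passed in as a callable, so random reshuffling is just one possible choice of permutation.

```python
import random


def random_reshuffle(n, rng):
    """Return a fresh uniformly random permutation of range(n)."""
    order = list(range(n))
    rng.shuffle(order)
    return order


def permutation_sgd(xs, ys, make_order, lr=0.05, epochs=200, seed=0):
    """Fit y = w * x by SGD, visiting every example once per epoch.

    `make_order(n, rng)` is a hypothetical hook that returns the visiting
    order for the coming epoch; any permutation of range(n) is allowed.
    """
    rng = random.Random(seed)
    w = 0.0
    n = len(xs)
    for _ in range(epochs):
        order = make_order(n, rng)  # permute the data before the epoch
        for i in order:
            # Gradient of the per-example loss (w * x_i - y_i)^2 w.r.t. w.
            grad = 2.0 * (w * xs[i] - ys[i]) * xs[i]
            w -= lr * grad
    return w


# Toy data generated by y = 2 * x, so the minimizer is w = 2.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
w_hat = permutation_sgd(xs, ys, random_reshuffle)
```

Swapping `random_reshuffle` for a function that always returns the same order gives a fixed-permutation variant, which is one way to see why a single analysis covering arbitrary orderings is useful.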