Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit

Neural Information Processing Systems 

Core to our analysis is the reuse of minibatch in the gradient computation, which gives rise to higher-order information beyond correlational queries.