 Statistical Learning


SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path-Integrated Differential Estimator

Neural Information Processing Systems

We provide a few error-bound results on its convergence rates. Specifically, we prove that the SPIDER-SFO algorithm achieves a gradient computation cost of O(min(n^{1/2} ε^{-2}, ε^{-3})) to find an ε-approximate first-order stationary point. In addition, we prove that SPIDER-SFO nearly matches the algorithmic lower bound for finding a stationary point under the gradient Lipschitz assumption in the finite-sum setting.
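
To make the shape of the stated cost bound concrete, the sketch below evaluates min(n^{1/2} ε^{-2}, ε^{-3}) and reports which term is the smaller one. It ignores the constant hidden by the O(.) notation, so the numbers are only relative magnitudes. The function names are hypothetical and do not come from the paper.

```python
import math


def spider_sfo_cost_bound(n: int, eps: float) -> float:
    """Evaluate min(n^{1/2} * eps^{-2}, eps^{-3}) with the O(.) constant dropped."""
    n_dependent_term = math.sqrt(n) * eps ** -2  # grows with the number of components n
    n_free_term = eps ** -3                      # independent of n
    return min(n_dependent_term, n_free_term)


def active_term(n: int, eps: float) -> str:
    """Name the term that attains the minimum (hypothetical helper for illustration)."""
    if math.sqrt(n) * eps ** -2 <= eps ** -3:
        return "n^{1/2} eps^{-2}"
    return "eps^{-3}"


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(n, spider_sfo_cost_bound(n, 0.01), active_term(n, 0.01))
```

Since n^{1/2} ε^{-2} ≤ ε^{-3} exactly when n ≤ ε^{-2}, the n-dependent term is the smaller one for moderate n, and the ε^{-3} term takes over once n exceeds ε^{-2}.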


Semi-Supervised Domain Generalization with Known and Unknown Classes

Neural Information Processing Systems

Semi-Supervised Domain Generalization (SSDG) aims to learn a model that is generalizable to an unseen target domain with only a few labels, and most existing SSDG methods assume that unlabeled training and testing samples are all known classes. However, a more realistic scenario is that known classes may be mixed with some unknown classes in unlabeled training and testing data.


The Convergence Rate of Neural Networks for Learned Functions of Different Frequencies

Neural Information Processing Systems

We study the relationship between the frequency of a function and the speed at which a neural network learns it. We build on recent results that show that the dynamics of overparameterized neural networks trained with gradient descent can be well approximated by a linear system. When normalized training data is uniformly distributed on a hypersphere, the eigenfunctions of this linear system are spherical harmonic functions.
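
A toy sketch of the linear-system view, under stated assumptions and not the paper's actual setup: the training points are evenly spaced on the unit circle (the one-dimensional analogue of a hypersphere), and the kernel exp(<x, x'> - 1) is an assumed stand-in for the kernel of the network. With these choices the kernel matrix is circulant, so each pure frequency cos(kθ) is an eigenvector and its residual shrinks by a fixed factor per gradient-descent step. The function names and the learning rate are hypothetical.

```python
import math


def circle_kernel_matrix(m: int):
    """Return the angles and the kernel matrix exp(cos(a - b) - 1) for m points on the circle."""
    thetas = [2.0 * math.pi * i / m for i in range(m)]
    kernel = [[math.exp(math.cos(a - b) - 1.0) for b in thetas] for a in thetas]
    return thetas, kernel


def l2_norm(vec):
    return math.sqrt(sum(v * v for v in vec))


def steps_to_fit(freq: int, m: int = 32, tol: float = 1e-2, max_steps: int = 20000) -> int:
    """Count steps of the linear dynamics r <- r - lr * K r until the residual
    for the target cos(freq * theta) falls below tol times its initial norm."""
    thetas, kernel = circle_kernel_matrix(m)
    lr = 1.0 / m  # assumed step size, small enough for the iteration to be stable here
    residual = [math.cos(freq * t) for t in thetas]  # prediction starts at zero
    start = l2_norm(residual)
    for step in range(1, max_steps + 1):
        k_r = [sum(kernel[i][j] * residual[j] for j in range(m)) for i in range(m)]
        residual = [r - lr * kr for r, kr in zip(residual, k_r)]
        if l2_norm(residual) <= tol * start:
            return step
    return max_steps


if __name__ == "__main__":
    for freq in (1, 2, 3):
        print("frequency", freq, "steps", steps_to_fit(freq))
```

For this particular stand-in kernel the eigenvalues shrink as the frequency grows, so higher-frequency targets need more steps; whether and how fast that happens for an actual network depends on its kernel.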