
Gradient descent


One-step differentiation of iterative algorithms

Neural Information Processing Systems

Automatic differentiation of an iterative algorithm requires backpropagating through every iteration, which is costly in time and memory. Implicit differentiation alleviates this issue but requires a custom implementation of Jacobian evaluation. In this paper, we study one-step differentiation, also known as Jacobian-free backpropagation, a method as easy as automatic differentiation and as efficient as implicit differentiation for fast algorithms (e.g., superlinearly convergent optimization methods such as Newton's method).
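As a concrete illustration, here is a minimal JAX sketch of one-step differentiation (Jacobian-free backpropagation) applied to a parametric Newton solver; the root-finding problem, step count, and initial point are illustrative assumptions, not the paper's setup.

```python
import jax
import jax.numpy as jnp

def newton_step(x, theta):
    # One Newton iteration for g(x, theta) = x**2 - theta = 0.
    g = x ** 2 - theta
    g_x = 2.0 * x
    return x - g / g_x

def one_step_solve(x0, theta, num_iters=20):
    # Run the solver to convergence with the graph cut off...
    x = x0
    for _ in range(num_iters):
        x = newton_step(x, theta)
    # ...then take a single differentiable step: the backward pass
    # traverses only this one iteration (Jacobian-free backprop).
    return newton_step(jax.lax.stop_gradient(x), theta)

# The solution is sqrt(theta); its exact derivative at theta = 4 is
# 1 / (2 * sqrt(4)) = 0.25. Because Newton's method converges
# superlinearly, the one-step derivative matches it up to tolerance.
print(jax.grad(one_step_solve, argnums=1)(1.0, 4.0))  # ~0.25
```

For merely linearly convergent iterations, the same trick yields only an approximation of the implicit gradient, which is why the speed of the underlying algorithm matters.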



On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms
Lam M. Nguyen

Neural Information Processing Systems

The stochastic gradient descent (SGD) algorithm is the method of choice in many machine learning tasks thanks to its scalability and efficiency in handling large-scale problems. In this paper, we focus on the shuffling version of SGD, which matches mainstream practical heuristics. We show convergence to a global solution of shuffling SGD for a class of non-convex functions in over-parameterized settings.
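For readers unfamiliar with the shuffling heuristic, the following JAX sketch shows its core loop: each epoch draws a fresh random permutation and makes one full pass over the data without replacement. The toy least-squares model, step size, and data are illustrative assumptions, not the paper's over-parameterized setting.

```python
import jax
import jax.numpy as jnp

def shuffling_sgd(w, X, y, key, num_epochs=20, lr=0.1):
    loss = lambda w, x, t: 0.5 * (jnp.dot(x, w) - t) ** 2
    grad_fn = jax.grad(loss)
    n = X.shape[0]
    for _ in range(num_epochs):
        key, sub = jax.random.split(key)
        perm = jax.random.permutation(sub, n)  # reshuffle once per epoch
        for i in perm:                         # full pass, no replacement
            w = w - lr * grad_fn(w, X[i], y[i])
    return w

key = jax.random.PRNGKey(0)
X = jax.random.normal(key, (32, 3))
y = X @ jnp.array([1.0, -2.0, 0.5])            # noiseless linear targets
w = shuffling_sgd(jnp.zeros(3), X, y, key)
```

This sampling scheme differs from i.i.d. sampling with replacement, and its per-epoch structure is what shuffling-type analyses exploit.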






Tight Risk Bounds for Gradient Descent on Separable Data

Neural Information Processing Systems

Interest in the generalization capabilities of unregularized gradient-based learning methods has grown markedly in recent years.
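To make the setting concrete, here is a hedged JAX sketch of the object under study: unregularized full-batch gradient descent on the logistic loss over linearly separable data. The synthetic data, step size, and iteration budget are illustrative assumptions.

```python
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
X = jax.random.normal(key, (64, 2))
y = jnp.sign(X @ jnp.array([2.0, -1.0]))       # separable by construction

def logistic_loss(w):
    # Numerically stable log(1 + exp(-margin)).
    return jnp.mean(jnp.logaddexp(0.0, -y * (X @ w)))

w = jnp.zeros(2)
for _ in range(1000):
    w = w - 1.0 * jax.grad(logistic_loss)(w)

# On separable data the empirical loss approaches zero only as ||w||
# grows without bound; risk bounds characterize the test error of the
# iterates along this diverging path.
print(logistic_loss(w), jnp.linalg.norm(w))
```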

