 Gradient Descent


Adaptive Proximal Gradient Method for Convex Optimization

Neural Information Processing Systems

In this paper, we explore two fundamental first-order algorithms in convex optimization, namely gradient descent (GD) and the proximal gradient method (ProxGD). Our focus is on making these algorithms entirely adaptive by leveraging local curvature information of smooth functions.
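
To illustrate what "adaptive by leveraging local curvature" can mean in practice, here is a minimal sketch of gradient descent whose step size is estimated from successive iterates and gradients rather than from a global smoothness constant. The specific rule used (a capped growth factor combined with the ratio of iterate change to gradient change) is one well-known rule of this kind and is an assumption here; it is not necessarily the scheme proposed in the paper. The function name `adaptive_gd`, the parameter `step0`, and the test problem are all hypothetical.

```python
import numpy as np

def adaptive_gd(grad, x0, step0=1e-6, iters=300):
    """Gradient descent with a step size estimated from local curvature.

    Assumed rule (illustrative, not taken from the paper above):
        step_k = min( sqrt(1 + theta_{k-1}) * step_{k-1},
                      ||x_k - x_{k-1}|| / (2 * ||g_k - g_{k-1}||) )
    where theta_{k-1} = step_{k-1} / step_{k-2}.
    """
    x_prev = np.asarray(x0, dtype=float)
    g_prev = grad(x_prev)
    step_prev = step0
    theta = np.inf                      # lets the first curvature estimate decide
    x = x_prev - step_prev * g_prev     # tiny initial step to get two points
    for _ in range(iters):
        g = grad(x)
        dx = np.linalg.norm(x - x_prev)
        dg = np.linalg.norm(g - g_prev)
        if dx == 0.0 or dg == 0.0:
            break                       # no measurable change: stop
        # dg / dx estimates the local Lipschitz constant of the gradient
        local = dx / (2.0 * dg)
        step = min(np.sqrt(1.0 + theta) * step_prev, local)
        x_prev, g_prev = x, g
        x = x - step * g
        theta = step / step_prev
        step_prev = step
    return x

# Hypothetical test problem: f(x) = 0.5 * x^T A x - b^T x, minimized at A^{-1} b
A = np.diag([1.0, 10.0])
b = np.array([1.0, 1.0])
x_star = adaptive_gd(lambda x: A @ x - b, x0=np.zeros(2))
```

No smoothness constant or line search is supplied: the only tuning input is a small initial step, which the curvature estimate overrides after the first iteration.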





Small steps no more: Global convergence of stochastic gradient bandits for arbitrary learning rates

Neural Information Processing Systems

In particular, we establish the surprising result that, for any constant learning rate η > 0, the stochastic gradient bandit algorithm is guaranteed to converge to the globally optimal policy almost surely.
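
For readers unfamiliar with the algorithm named above, the following is a minimal sketch of a stochastic gradient bandit with a softmax policy and a constant learning rate. It assumes the textbook form of the update without a baseline, Gaussian reward noise, and a three-armed problem; the names `gradient_bandit`, `mean_rewards`, and `noise` are hypothetical, and the exact setting analyzed in the paper may differ.

```python
import numpy as np

def softmax(theta):
    z = np.exp(theta - theta.max())     # subtract max for numerical stability
    return z / z.sum()

def gradient_bandit(mean_rewards, eta, steps, noise=0.1, seed=0):
    """Softmax gradient bandit with constant learning rate eta.

    Assumptions (illustrative): no baseline, Gaussian reward noise.
    """
    rng = np.random.default_rng(seed)
    mean_rewards = np.asarray(mean_rewards, dtype=float)
    k = mean_rewards.size
    theta = np.zeros(k)                 # one logit per arm
    for _ in range(steps):
        pi = softmax(theta)
        a = rng.choice(k, p=pi)         # sample an arm from the current policy
        r = mean_rewards[a] + noise * rng.normal()
        # Unbiased stochastic gradient of expected reward: r * (e_a - pi)
        g = -r * pi
        g[a] += r
        theta += eta * g
    return softmax(theta)

# Hypothetical three-armed problem; arm 2 has the highest mean reward
policy = gradient_bandit([0.2, 0.5, 0.9], eta=1.0, steps=5000)
```

The returned `policy` is the final softmax distribution over arms. The almost-sure convergence claim concerns the limit of this distribution, so a single finite run like this one is only an illustration, not evidence for the result.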