Adaptive Proximal Gradient Method for Convex Optimization

Neural Information Processing Systems

In this paper, we explore two fundamental first-order algorithms in convex optimization, namely, gradient descent (GD) and the proximal gradient method (ProxGD). Our focus is on making these algorithms entirely adaptive by leveraging local curvature information of smooth functions.
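To make the idea concrete, here is a minimal sketch of a proximal gradient step whose stepsize adapts to observed gradient differences (a standard proxy for local curvature, in the spirit of Malitsky-Mishchenko-style rules). This is an illustrative rule, not the authors' exact update; `adaptive_prox_grad`, `grad_f`, `prox_g`, and the constants are assumptions for illustration.

```python
import numpy as np

def adaptive_prox_grad(grad_f, prox_g, x0, gamma0=1e-3, n_iters=500):
    """Illustrative adaptive ProxGD: the stepsize is set from an
    empirical local Lipschitz estimate of grad_f (gradient difference
    over iterate difference), so no line search is needed."""
    x_prev, g_prev = x0, grad_f(x0)
    gamma = gamma0
    x = prox_g(x_prev - gamma * g_prev, gamma)  # first step with gamma0
    for _ in range(n_iters):
        g = grad_f(x)
        # Local curvature estimate along the last segment.
        dx, dg = x - x_prev, g - g_prev
        L_loc = np.linalg.norm(dg) / (np.linalg.norm(dx) + 1e-12)
        # Grow the stepsize geometrically, but keep it below ~1/(2 L_loc).
        gamma = min(np.sqrt(2.0) * gamma, 1.0 / (2.0 * L_loc + 1e-12))
        x_prev, g_prev = x, g
        x = prox_g(x - gamma * g, gamma)
    return x

# Usage sketch: LASSO, f(x) = 0.5*||Ax - b||^2, g(x) = lam*||x||_1,
# whose prox is soft-thresholding at level gamma*lam.
rng = np.random.default_rng(0)
A, b, lam = rng.normal(size=(50, 20)), rng.normal(size=50), 0.1
grad_f = lambda x: A.T @ (A @ x - b)
prox_g = lambda v, gamma: np.sign(v) * np.maximum(np.abs(v) - gamma * lam, 0.0)
x_hat = adaptive_prox_grad(grad_f, prox_g, np.zeros(20))
```

Because the curvature estimate reuses gradients the method already computes, the adaptivity adds essentially no per-iteration cost over plain ProxGD.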




Scaling Laws in Linear Regression: Compute, Parameters, and Data

Neural Information Processing Systems

From the perspective of statistical learning theory, the scaling law in (1) is rather intriguing. Moreover, existing analyses do not provide instance-wise matching lower bounds to verify the tightness of the upper bounds.
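Since the excerpt is fragmentary, the following is only a minimal simulation sketch of the phenomenon the title names: in linear regression with a power-law feature spectrum (a common assumption in scaling-law analyses, not necessarily the paper's exact model), the test error decays as a power law in the sample size, and the exponent can be estimated by a log-log fit. All names, constants, and the spectrum here are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 200
eigs = np.arange(1, d + 1) ** -1.5            # assumed power-law covariance spectrum
w_star = rng.normal(size=d) * np.sqrt(eigs)   # assumed ground-truth weights

def test_error(N, reps=20, noise=0.1):
    """Average population excess risk of ridge regression at sample size N."""
    errs = []
    for _ in range(reps):
        X = rng.normal(size=(N, d)) * np.sqrt(eigs)   # features with given spectrum
        y = X @ w_star + noise * rng.normal(size=N)
        w = np.linalg.solve(X.T @ X + 1e-3 * np.eye(d), X.T @ y)  # small ridge
        errs.append(np.sum(eigs * (w - w_star) ** 2))  # E[(x^T(w - w*))^2]
    return np.mean(errs)

Ns = np.array([50, 100, 200, 400, 800])
errors = np.array([test_error(N) for N in Ns])
# Fit an assumed power law error ~ C * N^slope in log-log coordinates.
slope, logC = np.polyfit(np.log(Ns), np.log(errors), 1)
print(f"estimated scaling exponent: {slope:.2f}")
```

The instance-wise lower bounds discussed above ask whether such an empirically observed decay rate is in fact the best achievable for each problem instance, which is exactly what matching upper and lower bounds would certify.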