Parameter-free Clipped Gradient Descent Meets Polyak Yuki Takezawa
–Neural Information Processing Systems
Gradient descent and its variants are de facto standard algorithms for training machine learning models. As gradient descent is sensitive to its hyperparameters, we need to tune the hyperparameters carefully using a grid search.
Feb-12-2026, 21:47:00 GMT