Independently-Normalized SGD for Generalized-Smooth Nonconvex Optimization
Yufeng Yang, Erin Tripp, Yifan Sun, Shaofeng Zou, Yi Zhou
Recent studies have shown that many nonconvex machine learning problems meet a so-called generalized-smooth condition that extends beyond traditional smooth nonconvex optimization. However, the existing algorithms designed for generalized-smooth nonconvex optimization encounter significant limitations in both their design and convergence analysis. In this work, we first study deterministic generalized-smooth nonconvex optimization and analyze the convergence of normalized gradient descent under the generalized Polyak-Łojasiewicz condition. Our results provide a comprehensive understanding of the interplay between gradient normalization and function geometry. Then, for stochastic generalized-smooth nonconvex optimization, we propose an independently-normalized stochastic gradient descent algorithm, which leverages independent sampling, gradient normalization and clipping to achieve an $\mathcal{O}(\epsilon^{-4})$ sample complexity under relaxed assumptions. Experiments demonstrate the fast convergence of our algorithm.
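As a rough illustration of how independent sampling, gradient normalization and clipping might fit together, the sketch below draws two independent stochastic gradients per step: one sets the update direction, the other sets only the normalization scale, which is clipped from below at 1. This is not the authors' exact update rule, which the abstract does not state. The toy objective $f(x) = \sum_i x_i^4 / 4$ (whose Hessian is unbounded, so it is not globally Lipschitz-smooth), the function names, the step size `lr`, the exponent `beta`, and the noise level are all assumptions made for the example.

```python
import math
import random


def stochastic_grad(x, rng, noise=0.1):
    """Noisy gradient of the toy objective f(x) = sum(x_i^4) / 4 (hypothetical test problem)."""
    return [xi ** 3 + noise * rng.gauss(0.0, 1.0) for xi in x]


def norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(vi * vi for vi in v))


def independently_normalized_sgd(x0, steps=2000, lr=0.05, beta=1.0, seed=0):
    """Sketch of SGD whose step is normalized by an independently sampled gradient.

    Assumed update (illustrative only, not taken from the paper):
        x <- x - lr * g_dir / max(1, ||g_scale|| ** beta)
    where g_dir and g_scale are two independent stochastic gradients at x.
    """
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        g_dir = stochastic_grad(x, rng)    # sample used for the update direction
        g_scale = stochastic_grad(x, rng)  # independent sample used only for the scale
        # Clipping at 1: large gradients shrink the step, small ones leave it unchanged.
        scale = max(1.0, norm(g_scale) ** beta)
        x = [xi - lr * gi / scale for xi, gi in zip(x, g_dir)]
    return x


if __name__ == "__main__":
    x_final = independently_normalized_sgd([3.0, -2.0])
    print("final iterate:", x_final, "norm:", norm(x_final))
```

Because the scale comes from a separate sample, it is statistically independent of the direction given the current iterate, which is the kind of decoupling the phrase "independent sampling" suggests. Whether the paper uses this particular form of scale and clipping is an assumption here.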
Oct-17-2024
- Country:
- North America > United States
- Arizona (0.04)
- Texas > Brazos County
- College Station (0.04)
- New York > Suffolk County
- Stony Brook (0.04)
- Genre:
- Research Report > New Finding (0.48)
- Industry:
- Education (0.34)
- Technology: