LyAm: Robust Non-Convex Optimization for Stable Learning in Noisy Environments
Mirzabeigi, Elmira, Rezaee, Sepehr, Parand, Kourosh
–arXiv.org Artificial Intelligence
Training deep neural networks for computer vision is inherently challenging due to issues such as unstable gradients, local minima, and pervasive noisy data [48]. These challenges are magnified in anomalous environments where data distributions deviate from the norm, critically impairing the optimization process. Such instability hinders the model's ability to learn robust representations and degrades its generalization to unseen data. The choice of optimizer is central to alleviating these issues, as it governs both convergence speed and stability during training. Over the decades, various optimizers have been proposed to tackle different facets of this optimization challenge. Early work on Stochastic Gradient Descent (SGD) [30, 33] laid the foundation for iterative gradient-based methods by employing a simple yet effective parameter update scheme. AdaGrad [4] introduced per-parameter learning rate adjustments to better handle sparse gradients, while Adam [13] fused momentum-based updates with adaptive learning rates, accelerating convergence. Subsequently, Adam variants such as AdamW [23], AdaBelief [52], and Adan [21] have sought to address limitations in Adam's adaptive mechanism and enhance robustness in complex, non-convex landscapes.
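To make the adaptive mechanism that these Adam variants build on concrete, the sketch below shows the standard Adam update [13] in NumPy. This is an illustrative implementation, not code from the paper: the function name `adam_step` and the toy quadratic loss are our own, and the hyperparameter defaults follow the original Adam paper.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: momentum plus per-parameter adaptive scaling."""
    m = beta1 * m + (1 - beta1) * grad      # first moment: momentum-style running mean
    v = beta2 * v + (1 - beta2) * grad**2   # second moment: per-parameter gradient scale
    m_hat = m / (1 - beta1**t)              # bias correction for zero-initialized moments
    v_hat = v / (1 - beta2**t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Illustrative usage on a toy quadratic loss f(theta) = ||theta||^2 / 2, so grad = theta.
theta = np.array([1.0, -2.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 101):
    grad = theta                            # gradient of the toy loss
    theta, m, v = adam_step(theta, grad, m, v, t)
```

Dividing the momentum estimate by the square root of the second-moment estimate is what gives each parameter its own effective learning rate; the variants cited above (AdamW, AdaBelief, Adan) each modify some part of this update to improve robustness.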
Jul-16-2025