AYLA: Amplifying Gradient Sensitivity via Loss Transformation in Non-Convex Optimization

Jun-25-2025–arXiv.org Artificial Intelligence

Stochastic Gradient Descent (SGD) and its variants, such as ADAM, are foundational to deep learning optimization, adjusting model parameters through fixed or adaptive learning rates based on loss function gradients. However, these methods often struggle to balance adaptability and efficiency in high-dimensional, non-convex settings. This paper introduces AYLA, a novel optimization framework that enhances training dynamics via loss function transformation. AYLA applies a tunable power-law transformation to the loss, preserving critical points while scaling loss values to amplify gradient sensitivity and accelerate convergence. Additionally, we propose an effective learning rate that dynamically adapts to the transformed loss, further improving optimization efficiency. Empirical evaluations on minimizing a synthetic non-convex polynomial, solving a non-convex curve-fitting task, and performing digit classification (MNIST) and image recognition (CIFAR-100) demonstrate that AYLA consistently outperforms SGD and ADAM in both convergence speed and training stability. By reshaping the loss landscape, AYLA provides a model-agnostic enhancement to existing optimization methods, offering a promising advancement in deep neural network training.
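The abstract does not state the exact form of the power-law transformation or of the effective learning rate, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes a strictly positive loss and the simplest transformation, `loss(x) ** power`. The one-dimensional test function, the names `loss`, `grad`, `transformed_grad`, `effective_lr`, and `descend`, and all numeric values are hypothetical. By the chain rule, the gradient of the transformed loss is the original gradient multiplied by `power * loss(x) ** (power - 1)`; because that factor is positive, critical points are unchanged, and the factor can be read as a loss-dependent rescaling of the step size.

```python
def loss(x):
    # Hypothetical non-convex, strictly positive test function (not from the paper).
    return (x * x - 1.0) ** 2 + 0.3 * (x - 0.5) ** 2 + 0.1


def grad(x):
    # Analytic derivative of loss(x).
    return 4.0 * x * (x * x - 1.0) + 0.6 * (x - 0.5)


def transformed_grad(x, power):
    # Chain rule for the assumed transformation loss(x) ** power:
    # d/dx loss(x) ** power = power * loss(x) ** (power - 1) * loss'(x).
    return power * loss(x) ** (power - 1.0) * grad(x)


def effective_lr(x, base_lr, power):
    # Descending the transformed loss with step base_lr is the same as
    # descending the original loss with this loss-dependent step size.
    return base_lr * power * loss(x) ** (power - 1.0)


def descend(x0, base_lr, power, steps):
    # Plain gradient descent on the transformed loss.
    x = x0
    for _ in range(steps):
        x -= base_lr * transformed_grad(x, power)
    return x


if __name__ == "__main__":
    for power in (0.5, 1.0, 2.0):
        x_final = descend(2.0, 0.01, power, 200)
        print(power, x_final, loss(x_final))
```

With `power = 1.0` the sketch reduces to ordinary gradient descent. Under this assumed form, exponents above 1 enlarge the step where the loss exceeds 1 and shrink it where the loss is below 1, while exponents below 1 do the opposite.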

artificial intelligence, ayla, machine learning

Country:
- Europe > Russia (0.04)
- Asia
  - Russia (0.04)
  - India (0.04)

Genre:
- Research Report (0.83)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Optimization (1.00)
  - Machine Learning
    - Neural Networks > Deep Learning (1.00)
    - Statistical Learning > Gradient Descent (0.69)
