AdamZ: An Enhanced Optimisation Method for Neural Network Training

Ilia Zaznov, Atta Badii, Alfonso Dufour, Julian Kunkel

arXiv.org Machine Learning 

In recent years, the field of machine learning has witnessed significant advancements, particularly in the development of optimisation algorithms that enhance the efficiency and effectiveness of training deep neural networks. Among these algorithms, the Adam optimiser has gained widespread popularity due to its adaptive learning rate capabilities, which enable more efficient convergence than traditional methods such as stochastic gradient descent. However, despite its advantages, Adam has limitations, particularly in handling overshooting and stagnation during training. To address these challenges, we introduce AdamZ, an advanced variant of the Adam optimiser. AdamZ is specifically designed to adjust the learning rate dynamically in response to the characteristics of the loss function, thereby improving both convergence stability and model accuracy. This novel optimiser integrates mechanisms to detect and mitigate overshooting, which occurs when the optimiser steps too far in the parameter space, and stagnation, which occurs when progress stalls despite ongoing training. By introducing hyperparameters such as overshoot and stagnation factors, thresholds, and patience levels, AdamZ provides a more responsive approach to learning rate adaptation than standard Adam.
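
To make the described mechanism concrete, the following is a minimal sketch of a learning-rate controller in the spirit of the abstract: shrink the step size when the loss spikes (overshooting) and enlarge it when the loss plateaus for several steps (stagnation). This is not the authors' implementation; all hyperparameter names, default values, and detection rules below are assumptions made for illustration only.

```python
from collections import deque


class AdaptiveLRController:
    """Hypothetical learning-rate controller illustrating overshoot/stagnation
    handling with factors, thresholds, and patience, as outlined in the abstract.
    Names and defaults are assumptions, not taken from the paper."""

    def __init__(self, lr=1e-3,
                 overshoot_factor=0.5, overshoot_threshold=1.05,
                 stagnation_factor=1.2, stagnation_threshold=1e-4,
                 patience=5, history=25):
        self.lr = lr
        self.overshoot_factor = overshoot_factor        # shrink lr by this factor on overshoot
        self.overshoot_threshold = overshoot_threshold  # relative loss jump that counts as overshoot
        self.stagnation_factor = stagnation_factor      # grow lr by this factor on stagnation
        self.stagnation_threshold = stagnation_threshold  # minimum improvement counted as progress
        self.patience = patience                        # steps without progress before reacting
        self.losses = deque(maxlen=history)             # recent loss history
        self.stagnant_steps = 0

    def update(self, loss):
        """Record the latest loss and return the (possibly adjusted) learning rate."""
        if self.losses:
            best = min(self.losses)
            if loss > self.overshoot_threshold * best:
                # Overshooting: loss jumped well above the recent best -> cut the learning rate.
                self.lr *= self.overshoot_factor
                self.stagnant_steps = 0
            elif best - loss < self.stagnation_threshold:
                # Stagnation: no meaningful improvement; after `patience` such steps, raise the rate.
                self.stagnant_steps += 1
                if self.stagnant_steps >= self.patience:
                    self.lr *= self.stagnation_factor
                    self.stagnant_steps = 0
            else:
                self.stagnant_steps = 0
        self.losses.append(loss)
        return self.lr
```

In use, such a controller would sit inside the training loop: after each step the current loss is passed to `update`, and the returned value is written back into the underlying Adam optimiser's learning rate before the next step.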