Multiple Descents in Deep Learning as a Sequence of Order-Chaos Transitions

Wenbo Wei, Nicholas Chong Jia Le, Choy Heng Lai, Ling Feng

arXiv.org Artificial Intelligence 

In deep learning, understanding training dynamics has become paramount for enhancing model performance, generalization, and robustness. Training a deep neural network involves navigating a complex, high-dimensional parameter space, where the interplay between model complexity, dataset characteristics, and the learning algorithm dictates the learning trajectory. This process is far from straightforward and is often characterized by phenomena such as overfitting, underfitting, and various forms of descent in performance metrics. The dynamics of training deep neural networks are critical for several reasons. A primary concern is generalization: the model's ability to transfer what it learns from training data to unseen data.
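The underfitting and overfitting regimes mentioned above can be illustrated with a toy experiment, which is not from the paper: fitting noisy samples of a smooth function with polynomials of increasing degree. All data and parameter choices here are illustrative assumptions; `np.linalg.lstsq` returns the minimum-norm least-squares fit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: noisy samples of a smooth target function (illustrative choice).
x_train = np.linspace(-1, 1, 15)
y_train = np.sin(np.pi * x_train) + 0.3 * rng.standard_normal(x_train.size)
x_test = np.linspace(-1, 1, 200)
y_test = np.sin(np.pi * x_test)

def poly_errors(degree):
    """Fit a degree-`degree` polynomial by least squares; return (train MSE, test MSE)."""
    X_tr = np.vander(x_train, degree + 1, increasing=True)
    X_te = np.vander(x_test, degree + 1, increasing=True)
    coef, *_ = np.linalg.lstsq(X_tr, y_train, rcond=None)
    train_err = np.mean((X_tr @ coef - y_train) ** 2)
    test_err = np.mean((X_te @ coef - y_test) ** 2)
    return train_err, test_err

# Low degree underfits (high train error); degree 14 interpolates the 15
# noisy points (near-zero train error) but typically generalizes poorly.
for d in (1, 3, 14):
    tr, te = poly_errors(d)
    print(f"degree={d:2d}  train MSE={tr:.4f}  test MSE={te:.4f}")
```

As model capacity grows, the training error falls monotonically while the test error follows a different course, which is the tension that the multiple-descent phenomenon studied in this paper generalizes.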