Data-Dependence of Plateau Phenomenon in Learning with Neural Network --- Statistical Mechanical Analysis
The plateau phenomenon, wherein the loss value stops decreasing during learning, has been reported by various researchers. The phenomenon was actively investigated in the 1990s and attributed to the fundamental hierarchical structure of neural network models; since then, it has been regarded as inevitable. However, the phenomenon seldom occurs in recent deep learning, leaving a gap between theory and practice. In this paper, using a statistical mechanical formulation, we clarify the relationship between the plateau phenomenon and the statistical properties of the learned data. We show that data whose covariance has small and dispersed eigenvalues tend to make the plateau phenomenon inconspicuous.
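The data-dependence described in the abstract can be illustrated with a toy teacher–student simulation (a minimal sketch, not the paper's statistical mechanical formulation; the network sizes, tanh activation, learning rate, and the particular eigenvalue spectra are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def g(x):
    # Smooth sigmoidal activation, standing in for the erf-type
    # units used in Saad-Solla-style analyses (an assumption here).
    return np.tanh(x)

def run(eigvals, n_steps=20000, eta=0.05, N=100, K=2, M=2):
    """Online SGD for a soft-committee student learning a fixed teacher.

    `eigvals` prescribes the spectrum of the (diagonal) input covariance,
    repeated to length N, so different calls probe different data statistics.
    """
    lam = np.resize(np.asarray(eigvals, float), N)
    sqrt_cov = np.sqrt(lam)                      # x ~ N(0, diag(lam))
    B = rng.normal(size=(M, N)) / np.sqrt(N)     # teacher weights (fixed)
    J = rng.normal(size=(K, N)) * 1e-3           # student starts near zero
    losses = []
    for _ in range(n_steps):
        x = rng.normal(size=N) * sqrt_cov
        y = g(B @ x).sum()                       # teacher output
        s = g(J @ x).sum()                       # student output
        err = s - y
        losses.append(0.5 * err**2)
        # Online SGD step on the per-example squared error.
        J -= eta * err * (1 - g(J @ x)**2)[:, None] * x[None, :]
    return np.array(losses)

loss_iso = run([1.0])          # isotropic inputs
loss_aniso = run([2.0, 0.1])   # anisotropic spectrum with small eigenvalues
```

Plotting the two loss curves (e.g. a smoothed log-log plot) lets one compare how flat the intermediate phase of learning looks under the two spectra; the sketch only demonstrates the simulation setup, not the paper's quantitative conclusions.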
Reviews: Data-Dependence of Plateau Phenomenon in Learning with Neural Network --- Statistical Mechanical Analysis
It would make more sense to show results for data with low-dimensional structure, in which the first one or two eigenvalues are non-zero and the rest are either zero or epsilon-small. Do the conclusions for the two-eigenvalue case still hold in this example? It is hard for me to see what I should learn from Figures 5 and 6. - The dependence of the learning dynamics on the spectral properties of the input data is not new and was previously studied by Saxe et al. (arXiv, 2013) for simple linear networks. It would be appropriate if these results were mentioned or discussed in the text. - It has been previously shown that the initial conditions have a big impact on the trainability and learning dynamics of these networks. In this case, they would be defined as the initial conditions on the order parameters Q, R, and D. - The analysis here seems tractable only for networks with a small number of hidden units.
This paper provides an analysis of the dynamics of online learning in two-layer neural networks under the teacher-student scenario. The analysis extends that of Saad and Solla (1995) by considering a covariance matrix of the input which may not be proportional to the identity matrix. The main contribution of this paper is the finding that the plateau phenomenon observed in the learning dynamics of nonlinear neural networks depends on the statistics of the input data. The three reviewers rated this paper above the acceptance threshold, mentioning the originality and importance of its contribution. At the same time, two reviewers raised concerns about the clarity of the presentation.
Noise-induced degeneration in online learning
Sato, Yuzuru, Tsutsui, Daiji, Fujiwara, Akio
Gradient descent is the simplest optimisation algorithm, represented by gradient dynamics in a potential. When the input data are finite, the gradient descent dynamics fluctuate due to finite-size effects, and the resulting method is called stochastic gradient descent. In this paper, we study the stability of stochastic gradient descent dynamics from the viewpoint of dynamical systems theory. Learning is characterised as nonautonomous dynamics driven by uncertain input from the external environment, and as multi-scale dynamics consisting of slow memory dynamics and fast system dynamics. When the uncertain input sequences are modelled by stochastic processes, the dynamics of learning are described by a random dynamical system. In contrast to the traditional Fokker-Planck approaches [5, 15], the random dynamical system approach enables the study not only of stationary distributions and global statistics, but also of the pathwise structure of stochastic dynamics. Based on nonautonomous and random dynamical systems theory, it is possible to analyse stability and bifurcation in machine learning.
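The pathwise view described in the abstract can be sketched with a toy example (the double-well potential, noise amplitude, and step size here are illustrative assumptions, not the authors' model):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy potential V(w) = (w^2 - 1)^2 / 4, with gradient w * (w^2 - 1).
def grad_V(w):
    return w * (w**2 - 1.0)

def sgd_path(w0, eta=0.05, noise=0.0, n_steps=5000):
    """One pathwise realization of noisy gradient dynamics: gradient
    descent on V driven by an i.i.d. noise sequence standing in for
    finite-sample fluctuations of the stochastic gradient."""
    w = w0
    path = np.empty(n_steps)
    for t in range(n_steps):
        xi = rng.normal()                      # uncertain external input
        w = w - eta * (grad_V(w) + noise * xi)
        path[t] = w
    return path

# The deterministic path settles into the minimum of the basin it
# starts in; the stochastic path keeps fluctuating around (and may
# escape) a well -- a pathwise property that stationary distributions
# alone do not reveal.
det = sgd_path(0.5, noise=0.0)
sto = sgd_path(0.5, noise=0.5)
```

Comparing many independent realizations of `sto` (rather than a single density over `w`) is the spirit of the random dynamical system viewpoint sketched here.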
Plateau Phenomenon in Gradient Descent Training of ReLU networks: Explanation, Quantification and Avoidance
Ainsworth, Mark, Shin, Yeonjong
The ability of neural networks to provide `best in class' approximation across a wide range of applications is well-documented. Nevertheless, the powerful expressivity of neural networks comes to naught if one is unable to effectively train (choose) the parameters defining the network. In general, neural networks are trained by gradient descent type optimization methods, or a stochastic variant thereof. In practice, such methods cause the loss function to decrease rapidly at the beginning of training but then, after a relatively small number of steps, to slow down significantly. The loss may even appear to stagnate over a large number of epochs, only to suddenly start decreasing rapidly again for no apparent reason. This so-called plateau phenomenon manifests itself in many learning tasks. The present work aims to identify and quantify the root causes of the plateau phenomenon. No assumptions are made on the number of neurons relative to the number of training data, and our results hold for both the lazy and adaptive regimes. The main findings are: plateaux correspond to periods during which activation patterns remain constant, where the activation pattern refers to the number of data points that activate a given neuron; quantification of the convergence of the gradient flow dynamics; and characterization of stationary points in terms of solutions of local least squares regression lines over subsets of the training data. Based on these conclusions, we propose a new iterative training method, Active Neuron Least Squares (ANLS), characterised by the explicit adjustment of the activation pattern at each step, which is designed to enable a quick exit from a plateau. Illustrative numerical examples are included throughout.
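The notion of an activation pattern, and how one would monitor its constancy along training, can be illustrated with a minimal full-batch gradient descent sketch (network size, data, and learning rate are illustrative assumptions; this is not the authors' ANLS method):

```python
import numpy as np

rng = np.random.default_rng(2)

# Tiny one-hidden-layer ReLU network fit to |x| by full-batch GD.
X = rng.normal(size=(20, 1))                 # 20 training points
Y = np.abs(X[:, 0])                          # target values
W = rng.normal(size=(4, 1)); b = rng.normal(size=(4,))
a = rng.normal(size=(4,)) * 0.1

def activation_pattern():
    """Number of training points activating each neuron -- the quantity
    whose constancy characterizes a plateau in the abstract above."""
    return tuple(((X @ W.T + b) > 0).sum(axis=0))

eta = 0.01
patterns, losses = [], []
for t in range(2000):
    H = np.maximum(X @ W.T + b, 0.0)         # ReLU hidden layer
    r = H @ a - Y                            # residual
    losses.append(0.5 * np.mean(r**2))
    # Gradients of the mean squared error.
    ga = H.T @ r / len(X)
    gH = np.outer(r, a) * (H > 0)            # backprop through ReLU mask
    gW = gH.T @ X / len(X)
    gb = gH.mean(axis=0) 
    a -= eta * ga; W -= eta * gW; b -= eta * gb
    patterns.append(activation_pattern())

# Count steps on which the activation pattern did not change; long
# unbroken runs of identical patterns are the plateau signature.
constant_steps = sum(p1 == p2 for p1, p2 in zip(patterns, patterns[1:]))
```

Inspecting where `patterns` changes against the `losses` curve shows whether stagnant stretches of the loss line up with frozen activation patterns, which is the diagnostic the abstract describes.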