AITopics | correlation trap

Collaborating Authors

correlation trap

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Grokking and Generalization Collapse: Insights from \texttt{HTSR} theory

Prakash, Hari K., Martin, Charles H.

arXiv.org Machine LearningJun-6-2025

We study the well-known grokking phenomena in neural networks (NNs) using a 3-layer MLP trained on 1 k-sample subset of MNIST, with and without weight decay, and discover a novel third phase -- \emph{anti-grokking} -- that occurs very late in training and resembles but is distinct from the familiar \emph{pre-grokking} phases: test accuracy collapses while training accuracy stays perfect. This late-stage collapse is distinct, from the known pre-grokking and grokking phases, and is not detected by other proposed grokking progress measures. Leveraging Heavy-Tailed Self-Regularization HTSR through the open-source WeightWatcher tool, we show that the HTSR layer quality metric $α$ alone delineates all three phases, whereas the best competing metrics detect only the first two. The \emph{anti-grokking} is revealed by training for $10^7$ and is invariably heralded by $α< 2$ and the appearance of \emph{Correlation Traps} -- outlier singular values in the randomized layer weight matrices that make the layer weight matrix atypical and signal overfitting of the training set. Such traps are verified by visual inspection of the layer-wise empirical spectral densities, and by using Kolmogorov--Smirnov tests on randomized spectra. Comparative metrics, including activation sparsity, absolute weight entropy, circuit complexity, and $l^2$ weight norms track pre-grokking and grokking but fail to distinguish grokking from anti-grokking. This discovery provides a way to measure overfitting and generalization collapse without direct access to the test data. These results strengthen the claim that the \emph{HTSR} $α$ provides universal layer-convergence target at $α\approx 2$ and underscore the value of using the HTSR alpha $(α)$ metric as a measure of generalization.

artificial intelligence, correlation trap, machine learning, (15 more...)

arXiv.org Machine Learning

2506.04434

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

How to Determine if Your Machine Learning Model is Overtrained - KDnuggets

#artificialintelligenceMay-26-2021, 00:02:08 GMT

The weightwatcher tool can detect the signatures of overtraining in specific layers of a pre/trained Deep Neural Networks. In the Figure above, fig (a) is well trained, whereas fig (b) may be over-trained. That orange spike on the far right is the tell-tale clue; it's what we call a Correlation Trap. Weightwatcher can detect the signatures of overtraining in specific layers of a pre/trained Deep Neural Networks. In this post, we show how to use the weightwatcher tool to do this.

deep neural network, eigenvalue, esd, (14 more...)

#artificialintelligence

Country: North America > United States > California > San Francisco County > San Francisco (0.17)

Industry: Education (0.39)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

is your model overtrained ?

#artificialintelligenceApr-5-2021, 19:00:18 GMT

Are your nodels over-trained ? The weightwatcher tool can detect the signatures of overtraining in specific layers of a pre/trained Deep Neural Networks. In the Figure abiove, fig (a) is well trained, whereas fig (b) may be over-trained. That orange spike on the far right is the tell-tale clue; it's what we call a Correlation…

deep neural network, eigenvalue, esd, (13 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

Add feedback