AITopics | prh

Collaborating Authors

prh

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Proof of a perfect platonic representation hypothesis

Ziyin, Liu, Chuang, Isaac

arXiv.org Machine LearningJul-3-2025

In this note, we elaborate on and explain in detail the proof given by Ziyin et al. (2025) of the "perfect" Platonic Representation Hypothesis (PRH) for the embedded deep linear network model (EDLN). We show that if trained with SGD, two EDLNs with different widths and depths and trained on different data will become Perfectly Platonic, meaning that every possible pair of layers will learn the same representation up to a rotation. Because most of the global minima of the loss function are not Platonic, that SGD only finds the perfectly Platonic solution is rather extraordinary. The proof also suggests at least six ways the PRH can be broken. We also show that in the EDLN model, the emergence of the Platonic representations is due to the same reason as the emergence of progressive sharpening. This implies that these two seemingly unrelated phenomena in deep learning can, surprisingly, have a common cause. Overall, the theory and proof highlight the importance of understanding emergent "entropic forces" due to the irreversibility of SGD training and their role in representation learning. The goal of this note is to be instructive and avoid lengthy technical details.

artificial intelligence, machine learning, representation, (16 more...)

arXiv.org Machine Learning

2507.01098

Country: North America > United States > Massachusetts (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

Generalisation in Feedforward Networks

Kowalczyk, Adam, Ferrá, Herman L.

Neural Information Processing SystemsDec-31-1995

They provide in particular some theoretical bounds on the sample complexity, i.e. a minimal number of training samples assuring the desired accuracy with the desired confidence. However there are a few obvious deficiencies in these results: (i) the sample complexity bounds are unrealistically high (c.f. Section 4.), and (ii) for some networks they do not hold at all since VC-dimension is infinite, e.g.

prh, sample complexity, vc-dimension, (14 more...)

Neural Information Processing Systems

Country: Oceania > Australia (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Generalisation in Feedforward Networks

Kowalczyk, Adam, Ferrá, Herman L.

Neural Information Processing SystemsDec-31-1995

prh, sample complexity, vc-dimension, (14 more...)

Neural Information Processing Systems

Country: Oceania > Australia (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Generalisation in Feedforward Networks

Kowalczyk, Adam, Ferrá, Herman L.

Neural Information Processing SystemsDec-31-1995

We show that the model provides asignificant improvement on the upper bounds of sample complexity, i.e. the minimal number of random training samples allowing a selection of the hypothesis with a predefined accuracy and confidence. Further, we show that the model has the potential forproviding a finite sample complexity even in the case of infinite VC-dimension as well as for a sample complexity below VC-dimension. This is achieved by linking sample complexity to an "average" number of implementable dichotomies of a training sample rather than the maximal size of a shattered sample, i.e. VC-dimension. 1 Introduction A number offundamental results in computational learning theory [1, 2, 11] links the generalisation error achievable by a set of hypotheses with its Vapnik-Chervonenkis dimension (VC-dimension, for short) which is a sort of capacity measure. They provide in particular some theoretical bounds on the sample complexity, i.e. a minimal number of training samples assuring the desired accuracy with the desired confidence. However there are a few obvious deficiencies in these results: (i) the sample complexity bounds are unrealistically high (c.f. Section 4.), and (ii) for some networks they do not hold at all since VC-dimension is infinite, e.g.

artificial intelligence, machine learning, sample complexity, (17 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Add feedback