Embedding Principle of Loss Landscape of Deep Neural Networks
Understanding the structure of the loss landscape of deep neural networks (DNNs) is of fundamental importance. In this work, we prove an embedding principle: the loss landscape of a DNN contains all the critical points of all the narrower DNNs. More precisely, we propose a critical embedding such that any critical point, e.g., a local or global minimum, of a narrower DNN can be embedded into a critical point/affine subspace of the target DNN with higher degeneracy while preserving the DNN output function. Note that this embedding structure of critical points holds for any training data, differentiable loss function, and differentiable activation function. This general structure of DNNs is starkly different from that of other nonconvex problems such as protein folding. Empirically, we find that a wide DNN is often attracted by highly degenerate critical points that are embedded from narrow DNNs. The embedding principle provides a new perspective for studying the generally easy optimization of wide DNNs and unravels a potential implicit low-complexity regularization during training. Overall, our work provides a skeleton for the study of the loss landscape of DNNs and its implications, by which a more exact and comprehensive understanding can be anticipated in the near future.
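To make the embedding concrete, below is a minimal sketch of a one-step neuron-splitting embedding for a one-hidden-layer tanh network, showing that the output function is preserved. The specific network, data, and splitting coefficient alpha are illustrative assumptions, not necessarily the paper's exact construction.

```python
# Minimal sketch of a one-step "splitting" embedding for a one-hidden-layer
# tanh network, illustrating an output-preserving embedding of a narrow
# network into a wider one.  All concrete values below are illustrative.
import numpy as np

rng = np.random.default_rng(0)


def forward(x, W1, b1, w2, b2):
    """f(x) = w2 . tanh(W1 x + b1) + b2 for a scalar-output network."""
    return np.tanh(x @ W1.T + b1) @ w2 + b2


# Narrow network: 1 input, m hidden neurons, 1 output.
m = 3
W1 = rng.normal(size=(m, 1))
b1 = rng.normal(size=m)
w2 = rng.normal(size=m)
b2 = rng.normal()

# Embed into a wider network (m + 1 neurons) by splitting neuron j:
# duplicate its input weights/bias and share its output weight as
# alpha * w2[j] and (1 - alpha) * w2[j].
j, alpha = 0, 0.3
W1_wide = np.vstack([W1, W1[j:j + 1]])
b1_wide = np.append(b1, b1[j])
w2_wide = np.append(w2, (1 - alpha) * w2[j])
w2_wide[j] = alpha * w2[j]

x = rng.normal(size=(5, 1))  # a few test inputs
narrow_out = forward(x, W1, b1, w2, b2)
wide_out = forward(x, W1_wide, b1_wide, w2_wide, b2)
print(np.allclose(narrow_out, wide_out))  # True: output function preserved
```

Since any choice of alpha yields the same output function, the embedded point lies on at least a one-parameter family in the wider parameter space, illustrating the higher degeneracy mentioned above.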
Supplement to: Embedding Principle of Loss Landscape of Deep Neural Networks
This transform is also a critical transform; however, it does not inform about the degeneracy of critical points/manifolds. For the 1D fitting experiments (Figs. 1, 3(a), 4), we use tanh as the activation function and the mean squared error loss. Depending on the experiment, training uses full-batch gradient descent with learning rate 0.005, the default Adam optimizer with full batch and learning rate 0.02, or the default Adam optimizer with full batch and learning rate 0.00003; the resulting output functions are shown in the figures. Note that, although Figs. 1 and 5 are case studies each based on a single random trial, similar phenomena are observed across trials.
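For orientation, here is a minimal sketch of such a 1D fitting setup. Only the tanh activation, mean squared error loss, and full-batch gradient descent with learning rate 0.005 are taken from the description above; the target function, data size, network width, and number of steps are assumptions for illustration.

```python
# Illustrative 1D fitting setup: tanh activation, MSE loss,
# full-batch gradient descent with learning rate 0.005.
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.linspace(-1, 1, 20).unsqueeze(1)   # assumed 1D training inputs
y = torch.sin(3 * x)                          # assumed target function

model = nn.Sequential(nn.Linear(1, 50), nn.Tanh(), nn.Linear(50, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.005)  # full-batch GD
loss_fn = nn.MSELoss()

for step in range(10000):                     # assumed number of steps
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)               # loss on the full batch
    loss.backward()
    optimizer.step()
```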
Embedding Principle: a hierarchical structure of loss landscape of deep neural networks
Yaoyu Zhang, Yuqing Li, Zhongwang Zhang, Tao Luo, Zhi-Qin John Xu
We prove a general Embedding Principle of the loss landscape of deep neural networks (NNs) that unravels a hierarchical structure of the loss landscape: the loss landscape of an NN contains all critical points of all the narrower NNs. This result is obtained by constructing a class of critical embeddings which map any critical point of a narrower NN to a critical point of the target NN with the same output function. By discovering a wide class of general compatible critical embeddings, we provide a gross estimate of the dimension of critical submanifolds embedded from critical points of narrower NNs. We further prove an irreversibility property of any critical embedding: the number of negative/zero/positive eigenvalues of the Hessian matrix at a critical point may increase but never decreases as the NN becomes wider through the embedding. Using a special realization of the general compatible critical embedding, we prove a stringent necessary condition for a critical point to be "truly bad", i.e., one that never becomes a strict-saddle point through any critical embedding. This result implies that strict-saddle points are commonplace in wide NNs, which may be an important reason underlying the easy optimization of wide NNs widely observed in practice.
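The following is a numerical sanity-check sketch, under assumed toy settings, of two of these claims: that a splitting embedding maps an (approximate) critical point of a narrow NN to an (approximate) critical point of the wider NN, and that the counts of negative/zero/positive Hessian eigenvalues do not decrease under the embedding. The architecture, data, tolerance, and the particular splitting embedding used here are assumptions for illustration, not the paper's exact operators.

```python
# Toy numerical check: criticality is preserved and Hessian eigenvalue-sign
# counts do not decrease under a neuron-splitting embedding.  All concrete
# settings (width, data, optimizer, tolerance) are illustrative assumptions.
import torch

torch.manual_seed(0)
x = torch.linspace(-1, 1, 10).unsqueeze(1)
y = torch.sin(3 * x)


def unpack(theta, m):
    """Split a flat parameter vector into (W1, b1, w2, b2) for width m."""
    W1 = theta[:m].reshape(m, 1)
    b1 = theta[m:2 * m]
    w2 = theta[2 * m:3 * m]
    b2 = theta[3 * m]
    return W1, b1, w2, b2


def loss(theta, m):
    W1, b1, w2, b2 = unpack(theta, m)
    pred = torch.tanh(x @ W1.T + b1) @ w2 + b2
    return ((pred - y.squeeze(1)) ** 2).mean()


def eig_counts(H, tol=1e-4):
    """Count (negative, near-zero, positive) eigenvalues of a symmetric H."""
    ev = torch.linalg.eigvalsh(H)
    return int((ev < -tol).sum()), int((ev.abs() <= tol).sum()), int((ev > tol).sum())


# 1. Drive a narrow network (m = 2) close to a critical point.
m = 2
theta = torch.randn(3 * m + 1, requires_grad=True)
opt = torch.optim.Adam([theta], lr=1e-2)
for _ in range(5000):
    opt.zero_grad()
    loss(theta, m).backward()
    opt.step()
theta = theta.detach()

# 2. Embed into a wider network (m + 1) by splitting neuron 0 with alpha = 0.5.
W1, b1, w2, b2 = unpack(theta, m)
alpha = 0.5
W1_w = torch.cat([W1, W1[0:1]])
b1_w = torch.cat([b1, b1[0:1]])
w2_w = torch.cat([w2, (1 - alpha) * w2[0:1]])
w2_w = torch.cat([alpha * w2[0:1], w2_w[1:]])
theta_wide = torch.cat([W1_w.flatten(), b1_w, w2_w, b2.unsqueeze(0)])

# 3. Gradients are (approximately) zero at both the narrow point and its image ...
g_narrow = torch.autograd.functional.jacobian(lambda t: loss(t, m), theta)
g_wide = torch.autograd.functional.jacobian(lambda t: loss(t, m + 1), theta_wide)
print(g_narrow.norm().item(), g_wide.norm().item())

# 4. ... and Hessian eigenvalue-sign counts do not decrease after embedding.
H_narrow = torch.autograd.functional.hessian(lambda t: loss(t, m), theta)
H_wide = torch.autograd.functional.hessian(lambda t: loss(t, m + 1), theta_wide)
print(eig_counts(H_narrow), eig_counts(H_wide))
```

In this sketch the wider network gains parameters, so extra (typically near-zero) Hessian eigenvalues appear at the embedded point, while the existing negative/zero/positive counts are retained, consistent with the irreversibility property stated above.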