AITopics | critical initialization

Collaborating Authors

critical initialization

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Critical Initialization of Wide and Deep Neural Networks using Partial Jacobians: General Theory and Applications

Neural Information Processing SystemsDec-26-2025, 05:28:17 GMT

Deep neural networks are notorious for defying theoretical treatment. However, when the number of parameters in each layer tends to infinity, the network function is a Gaussian process (GP) and quantitatively predictive description is possible. Gaussian approximation allows one to formulate criteria for selecting hyperparameters, such as variances of weights and biases, as well as the learning rate. These criteria rely on the notion of criticality defined for deep neural networks. In this work we describe a new practical way to diagnose criticality.

critical initialization, partial jacobian, wide and deep neural network, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Critical Initialization of Wide and Deep Neural Networks using Partial Jacobians: General Theory and Applications

Neural Information Processing SystemsJan-19-2025, 10:28:20 GMT

critical initialization, general theory and application, wide and deep neural network, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Scaling and Resizing Symmetry in Feedforward Networks

Cardona, Carlos

arXiv.org Artificial IntelligenceJun-26-2023

Weights initialization in deep neural networks have a strong impact on the speed of converge of the learning map. Recent studies have shown that in the case of random initializations, a chaos/order phase transition occur in the space of variances of random weights and biases. Experiments then had shown that large improvements can be made, in terms of the training speed, if a neural network is initialized on values along the critical line of such phase transition. In this contribution, we show evidence that the scaling property exhibited by physical systems at criticality, is also present in untrained feedforward networks with random weights initialization at the critical line. Additionally, we suggest an additional data-resizing symmetry, which is directly inherited from the scaling symmetry at criticality.

correlation, initialization, mean correlation, (16 more...)

arXiv.org Artificial Intelligence

2306.15015

Country:

North America > United States > Ohio > Cuyahoga County > Cleveland (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Italy > Sardinia (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

AutoInit: Automatic Initialization via Jacobian Tuning

He, Tianyu, Doshi, Darshil, Gromov, Andrey

arXiv.org Machine LearningJun-27-2022

Good initialization is essential for training Deep Neural Networks (DNNs). Oftentimes such initialization is found through a trial and error approach, which has to be applied anew every time an architecture is substantially modified, or inherited from smaller size networks leading to sub-optimal initialization. In this work we introduce a new and cheap algorithm, that allows one to find a good initialization automatically, for general feed-forward DNNs. The algorithm utilizes the Jacobian between adjacent network blocks to tune the network hyperparameters to criticality. We solve the dynamics of the algorithm for fully connected networks with ReLU and derive conditions for its convergence. We then extend the discussion to more general architectures with BatchNorm and residual connections. Finally, we apply our method to ResMLP and VGG architectures, where the automatic one-shot initialization found by our method shows good performance on vision tasks.

artificial intelligence, initialization, machine learning, (17 more...)

arXiv.org Machine Learning

2206.13568

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

Critical initialization of wide and deep neural networks through partial Jacobians: general theory and applications to LayerNorm

Doshi, Darshil, He, Tianyu, Gromov, Andrey

arXiv.org Machine LearningNov-30-2021

Deep neural networks are notorious for defying theoretical treatment. However, when the number of parameters in each layer tends to infinity the network function is a Gaussian process (GP) and quantitatively predictive description is possible. Gaussian approximation allows to formulate criteria for selecting hyperparameters, such as variances of weights and biases, as well as the learning rate. These criteria rely on the notion of criticality defined for deep neural networks. In this work we describe a new way to diagnose (both theoretically and empirically) this criticality. To that end, we introduce partial Jacobians of a network, defined as derivatives of preactivations in layer $l$ with respect to preactivations in layer $l_0

jacobian, layernorm, wide and deep neural network, (11 more...)

arXiv.org Machine Learning

2111.12143

Country:

North America > United States > Rhode Island (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.81)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback