AITopics | local hessian

local hessian

Local properties of neural networks through the lens of layer-wise Hessians

Bolshim, Maxim, Kugaevskikh, Alexander

arXiv.org Artificial Intelligence, Nov-11-2025

We introduce a methodology for analyzing neural networks through the lens of layer-wise Hessian matrices. The local Hessian of each functional block (layer) is defined as the matrix of second derivatives of a scalar function with respect to the parameters of that layer. This concept provides a formal tool for characterizing the local geometry of the parameter space. We show that the spectral properties of local Hessians, such as the distribution of eigenvalues, reveal quantitative patterns associated with overfitting, underparameterization, and expressivity in neural network architectures. We conduct an extensive empirical study involving 111 experiments across 37 datasets. The results demonstrate consistent structural regularities in the evolution of local Hessians during training and highlight correlations between their spectra and generalization performance. These findings establish a foundation for using local geometric analysis to guide the diagnosis and design of deep neural networks. The proposed framework connects optimization geometry with functional behavior and offers practical insight for improving network architectures and training stability.
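
The definition in the abstract can be made concrete with a small sketch. The following is an illustration only, not the authors' code: it assumes a toy two-layer scalar network, a mean-squared-error loss, and a finite-difference Hessian. The names `loss`, `local_hessian`, and `eigenvalues_2x2`, the data, and the parameter values are all hypothetical.

```python
import math

# Toy regression data (an assumption made for this illustration).
XS = [-1.0, -0.5, 0.0, 0.5, 1.0]
YS = [math.sin(x) for x in XS]


def loss(layer1, layer2):
    """Mean squared error of y = w2 * tanh(w1 * x + b1) + b2."""
    w1, b1 = layer1
    w2, b2 = layer2
    total = 0.0
    for x, y in zip(XS, YS):
        pred = w2 * math.tanh(w1 * x + b1) + b2
        total += (pred - y) ** 2
    return total / len(XS)


def local_hessian(f, theta, eps=1e-4):
    """Central finite-difference Hessian of f with respect to theta only."""
    n = len(theta)
    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                t = list(theta)
                t[i] += di
                t[j] += dj
                return f(t)
            hess[i][j] = (shifted(eps, eps) - shifted(eps, -eps)
                          - shifted(-eps, eps) + shifted(-eps, -eps)) / (4 * eps * eps)
    return hess


def eigenvalues_2x2(h):
    """Eigenvalues of a symmetric 2x2 matrix, smallest first."""
    a, b, d = h[0][0], 0.5 * (h[0][1] + h[1][0]), h[1][1]
    mean = 0.5 * (a + d)
    radius = math.sqrt((0.5 * (a - d)) ** 2 + b ** 2)
    return mean - radius, mean + radius


# Hypothetical parameter values for the two layers.
layer1 = [0.8, 0.1]    # (w1, b1)
layer2 = [1.2, -0.05]  # (w2, b2)

# Each local Hessian differentiates with respect to one layer, holding the other fixed.
h1 = local_hessian(lambda t: loss(t, layer2), layer1)
h2 = local_hessian(lambda t: loss(layer1, t), layer2)
spec1 = eigenvalues_2x2(h1)
spec2 = eigenvalues_2x2(h2)
print("layer 1 spectrum:", spec1)
print("layer 2 spectrum:", spec2)
```

Because the loss is quadratic in the last layer's parameters, the second block's local Hessian is positive semidefinite at every point, while the first block's spectrum depends on where in parameter space it is evaluated. For realistic networks an automatic-differentiation library would replace the finite differences used here.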

artificial intelligence, deep learning, machine learning, (17 more...)

2510.17486

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology (0.46)
Social Sector (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

On the Local Hessian in Back-propagation

Neural Information Processing Systems, Sep-29-2025, 22:56:19 GMT

artificial intelligence, hessian, machine learning, (7 more...)

Industry: Energy > Oil & Gas (0.60)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

On the Local Hessian in Back-propagation

Huishuai Zhang, Wei Chen, Tie-Yan Liu

Neural Information Processing Systems, Sep-29-2025, 19:41:58 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, deep learning, machine learning, (14 more...)

Country:

North America > Canada (0.46)
Asia (0.28)

Industry: Energy > Oil & Gas (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

On the Local Hessian in Back-propagation

Huishuai Zhang, Wei Chen, Tie-Yan Liu

Neural Information Processing Systems, Sep-28-2025, 03:03:09 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, deep learning, machine learning, (14 more...)

Country:

North America > Canada (0.46)
Asia (0.28)

Industry: Energy > Oil & Gas (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Reviews: On the Local Hessian in Back-propagation

Neural Information Processing Systems, Oct-8-2024, 01:43:18 GMT

The authors propose that back-propagation with respect to a loss function is equivalent to a single step of a "back-matching propagation" procedure in which, after a forward evaluation, the weights and input activations of each block are optimized alternately to minimize a loss on that block's output. They argue that architectures and training procedures that improve the condition number of the Hessian of this back-matching loss train more efficiently, and they support this by analytically studying the effects of orthonormal initialization, skip connections, and batch normalization. They offer further evidence by designing a blockwise learning-rate scaling method based on an approximation of the back-matching loss and demonstrating an improved learning curve for VGG13 on CIFAR-10 and CIFAR-100.
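
To see why the condition number of a Hessian matters for training efficiency, here is a minimal sketch on a plain quadratic objective. It is a generic illustration, not the back-matching loss from the reviewed paper; the function name `gd_steps_to_converge`, the eigenvalues, and the tolerance are assumptions chosen for the example.

```python
def gd_steps_to_converge(eigs, tol=1e-6, max_steps=100_000):
    """Gradient descent on f(x) = 0.5 * sum(e_i * x_i**2) with step 1 / max(eigs).

    Returns the number of steps taken until every coordinate is below tol.
    """
    lr = 1.0 / max(eigs)
    x = [1.0] * len(eigs)
    for step in range(max_steps):
        if max(abs(v) for v in x) < tol:
            return step
        # The gradient of the quadratic is e_i * x_i in each coordinate.
        x = [v - lr * e * v for v, e in zip(x, eigs)]
    return max_steps


well = gd_steps_to_converge([1.0, 2.0])    # Hessian condition number 2
ill = gd_steps_to_converge([1.0, 100.0])   # Hessian condition number 100
print("well-conditioned steps:", well)
print("ill-conditioned steps:", ill)
```

With a step size set by the largest eigenvalue, the slowest direction contracts by a factor of roughly 1 - 1/κ per step, where κ is the condition number, so the ill-conditioned problem needs far more iterations.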

back-propagation, hessian, procedure, (7 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

On the Local Hessian in Back-propagation

Zhang, Huishuai, Chen, Wei, Liu, Tie-Yan

Neural Information Processing Systems, Feb-14-2020, 18:41:55 GMT

artificial intelligence, local hessian, neural network, (5 more...)

Industry: Energy > Oil & Gas (0.63)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

On the Local Hessian in Back-propagation

Zhang, Huishuai, Chen, Wei, Liu, Tie-Yan

Neural Information Processing Systems, Dec-31-2018

Back-propagation (BP) is the foundation for successfully training deep neural networks. However, BP sometimes has difficulty propagating a learning signal deep enough, e.g., the vanishing gradient phenomenon. Meanwhile, BP often works well when combined with "design tricks" such as orthogonal initialization, batch normalization, and skip connections. There is no clear understanding of what is essential to the efficiency of BP. In this paper, we take one step towards clarifying this problem. We view BP as a solution of back-matching propagation, which minimizes a sequence of back-matching losses, each corresponding to one block of the network. We study the Hessian of the local back-matching loss (the local Hessian) and connect it to the efficiency of BP. It turns out that those design tricks facilitate BP by improving the spectrum of the local Hessian. In addition, we can use the local Hessian to balance the training pace of each block and design new training algorithms. Based on a scalar approximation of the local Hessian, we propose a scale-amended SGD algorithm. We apply it to train neural networks with batch normalization and achieve favorable results over vanilla SGD. This corroborates the importance of the local Hessian from another angle.
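
The idea of balancing the training pace of each block with a scalar curvature estimate can be sketched as follows. This is a hedged illustration of blockwise step scaling in general, not the scale-amended SGD algorithm from the paper, whose exact scalar approximation is not given in the abstract. The function `scaled_sgd_step`, the block names, and the curvature values are hypothetical.

```python
def scaled_sgd_step(params, grads, curvature, base_lr=0.5, eps=1e-8):
    """One SGD step where each block's step is divided by a scalar curvature estimate.

    params, grads: dicts mapping a block name to a list of floats.
    curvature: dict mapping a block name to a positive scalar (assumed to be given).
    """
    new_params = {}
    for name, block in params.items():
        lr = base_lr / (curvature[name] + eps)
        new_params[name] = [p - lr * g for p, g in zip(block, grads[name])]
    return new_params


# Toy loss 0.5 * (1 * a**2 + 100 * b**2): block "b" is 100 times more curved than "a".
curvature = {"a": 1.0, "b": 100.0}
params = {"a": [1.0], "b": [1.0]}
grads = {name: [curvature[name] * p for p in block] for name, block in params.items()}

updated = scaled_sgd_step(params, grads, curvature)
print(updated)
```

In this toy case both blocks shrink toward the minimum at the same relative rate despite very different curvature, whereas a single shared learning rate would have to be set by the stiffest block and would leave the other block training slowly.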

artificial intelligence, deep learning, machine learning, (14 more...)

Country:

Asia (0.28)
North America > Canada > Ontario > Toronto (0.14)
Europe (0.14)

Industry: Energy > Oil & Gas (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
