AITopics | kfac

Kronecker-Factored Approximate Curvature for Physics-Informed Neural Networks

Neural Information Processing Systems, Mar-19-2026, 17:03:31 GMT

Physics-Informed Neural Networks (PINNs) are infamous for being hard to train. Recently, second-order methods based on natural gradient and Gauss-Newton methods have shown promising performance, improving the accuracy achieved by first-order methods by several orders of magnitude. While promising, the proposed methods only scale to networks with a few thousand parameters due to the high computational cost of evaluating, storing, and inverting the curvature matrix. We propose Kronecker-factored approximate curvature (KFAC) for PINN losses that greatly reduces the computational cost and allows scaling to much larger networks. Our approach goes beyond the popular KFAC for traditional deep learning problems as it captures contributions from a PDE's differential operator that are crucial for optimization. To establish KFAC for such losses, we use Taylor-mode automatic differentiation to describe the differential operator's computation graph as a forward network with shared weights, which allows us to apply a variant of KFAC for networks with weight-sharing. Empirically, we find that our KFAC-based optimizers are competitive with expensive second-order methods on small problems, scale more favorably to higher-dimensional neural networks and PDEs, and consistently outperform first-order methods.

artificial intelligence, machine learning, proceedings, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.63)

Fast Approximate Natural Gradient Descent in a Kronecker Factored Eigenbasis

Neural Information Processing Systems, Mar-16-2026, 20:26:45 GMT

For models with many parameters, the covariance matrix they are based on becomes gigantic, making them inapplicable in their original form. This has motivated research into both simple diagonal approximations and more sophisticated factored approximations such as KFAC (Heskes, 2000; Martens & Grosse, 2015; Grosse & Martens, 2016). In the present work we draw inspiration from both to propose a novel approximation that is provably better than KFAC and amenable to cheap partial updates. It consists of tracking a diagonal variance, not in parameter coordinates, but in a Kronecker-factored eigenbasis, in which the diagonal approximation is likely to be more effective. Experiments show improvements over KFAC in optimization speed for several deep network architectures.
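As a rough illustration of the idea in this abstract, the NumPy sketch below compares the KFAC approximation with a diagonal rescaling in the Kronecker-factored eigenbasis for a single linear layer. The synthetic data, the variable names, the damping value, and the row-major flattening convention are all assumptions made for this example; it is not taken from the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_in, d_out = 256, 4, 3  # hypothetical sample count and layer sizes

# Synthetic layer inputs a_i and back-propagated output gradients g_i.
a = rng.normal(size=(n, d_in))
g = rng.normal(size=(n, d_out)) * (1.0 + np.abs(a[:, :1]))  # make g depend on a

# Per-example weight gradients g_i a_i^T, flattened row-major (vec = g_i kron a_i).
per_ex = np.einsum("ni,nj->nij", g, a).reshape(n, -1)
F = per_ex.T @ per_ex / n  # empirical second-moment (Fisher-like) matrix

# KFAC: approximate F by the Kronecker product of two small factors.
A = a.T @ a / n
G = g.T @ g / n
F_kfac = np.kron(G, A)

# Kronecker-factored eigenbasis: eigenvectors of the two factors.
s_A, U_A = np.linalg.eigh(A)
s_G, U_G = np.linalg.eigh(G)
U = np.kron(U_G, U_A)  # orthogonal basis shared by both approximations

# Eigenbasis-diagonal variant: keep the basis U, but track the second moment
# of the projected per-example gradients instead of using kron(s_G, s_A).
proj = per_ex @ U
s_diag = (proj ** 2).mean(axis=0)
F_diag = (U * s_diag) @ U.T

err_kfac = np.linalg.norm(F - F_kfac)
err_diag = np.linalg.norm(F - F_diag)
print(f"Frobenius error, KFAC: {err_kfac:.4f}")
print(f"Frobenius error, eigenbasis diagonal: {err_diag:.4f}")

# Preconditioning a mean gradient with an assumed damping term.
damping = 1e-3
grad = per_ex.mean(axis=0)
precond_grad = U @ ((U.T @ grad) / (s_diag + damping))
```

Within a fixed orthogonal basis, the diagonal of the projected matrix is the best diagonal fit in Frobenius norm, so the second error printed here cannot exceed the first. That is the sense in which this sketch mirrors the "provably better than KFAC" claim.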

artificial intelligence, name change, proceedings, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.43)

Fast Approximate Natural Gradient Descent in a Kronecker Factored Eigenbasis

Thomas George, César Laurent, Xavier Bouthillier, Nicolas Ballas, Pascal Vincent

Neural Information Processing Systems, Feb-19-2026, 17:44:25 GMT

Neural Information Processing Systems http://nips.cc/

approximation, ekfac, kfac, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
Oceania > Tonga (0.04)
North America > United States > Indiana > Hamilton County > Fishers (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

ec5aa0b7846082a2415f0902f0da88f2-Supplemental.pdf

Neural Information Processing Systems, Feb-11-2026, 18:26:37 GMT

approximation, continual learning, matrix, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

dae3312c4c6c7000a37ecfb7b0aeb0e4-Supplemental.pdf

Neural Information Processing Systems, Feb-11-2026, 11:05:37 GMT

algorithm, matrix, shampoo, (14 more...)

Neural Information Processing Systems

Country: North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.49)

50d005f92a6c5c9646db4b761da676ba-Paper-Conference.pdf

Neural Information Processing Systems, Feb-8-2026, 22:39:00 GMT

approximation, invariance, neural network, (14 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Fast Approximate Natural Gradient Descent in a Kronecker Factored Eigenbasis

Neural Information Processing Systems, Nov-20-2025, 22:08:16 GMT

For models with many parameters, the covariance matrix they are based on becomes gigantic, making them inapplicable in their original form. This has motivated research into both simple diagonal approximations and more sophisticated factored approximations such as KFAC (Heskes, 2000; Martens & Grosse, 2015; Grosse & Martens, 2016). In the present work we draw inspiration from both to propose a novel approximation that is provably better than KFAC and amenable to cheap partial updates. It consists of tracking a diagonal variance, not in parameter coordinates, but in a Kronecker-factored eigenbasis, in which the diagonal approximation is likely to be more effective. Experiments show improvements over KFAC in optimization speed for several deep network architectures.

fast approximate natural gradient descent, kronecker factored eigenbasis, name change, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.43)

Fast Approximate Natural Gradient Descent in a Kronecker Factored Eigenbasis

Thomas George, César Laurent, Xavier Bouthillier, Nicolas Ballas, Pascal Vincent

Neural Information Processing Systems, Nov-20-2025, 16:06:49 GMT

Stochastic Gradient Descent (SGD) and its variants are the current workhorse for training neural networks.

approximation, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
Oceania > Tonga (0.04)
North America > United States > Indiana > Hamilton County > Fishers (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

MAC: An Efficient Gradient Preconditioning using Mean Activation Approximated Curvature

Hyunseok Seung, Jaewoo Lee, Hyunsuk Ko

arXiv.org Artificial Intelligence, Nov-12-2025

Second-order optimization methods for training neural networks, such as KFAC, exhibit superior convergence by utilizing curvature information of the loss landscape. However, this comes at the expense of a high computational burden. In this work, we analyze the two components that constitute the layer-wise Fisher information matrix (FIM) used in KFAC: the Kronecker factors related to activations and pre-activation gradients. Based on empirical observations of their eigenspectra, we propose efficient approximations for them, resulting in a computationally efficient optimization method called MAC. To the best of our knowledge, MAC is the first algorithm to apply the Kronecker factorization to the FIM of attention layers used in transformers and to explicitly integrate attention scores into the preconditioning. We also study the convergence properties of MAC on nonlinear neural networks and provide two conditions under which it converges to global minima. Our extensive evaluations on various network architectures and datasets show that the proposed method outperforms KFAC and other state-of-the-art methods in terms of accuracy, end-to-end training time, and memory usage.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2506.08464

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Technology: