approximate fisher information
Understanding Approximate Fisher Information for Fast Convergence of Natural Gradient Descent in Wide Neural Networks
Natural Gradient Descent (NGD) helps to accelerate the convergence of gradient descent dynamics, but it requires approximations in large-scale deep neural networks because of its high computational cost. Empirical studies have confirmed that some NGD methods with approximate Fisher information converge sufficiently fast in practice. Nevertheless, it remains unclear from a theoretical perspective why and under what conditions such heuristic approximations work well. In this work, we reveal that, under specific conditions, NGD with approximate Fisher information achieves the same fast convergence to global minima as exact NGD. We consider deep neural networks in the infinite-width limit, and analyze the asymptotic training dynamics of NGD in function space via the neural tangent kernel.
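For intuition, the following is a minimal sketch of the function-space picture the abstract refers to, under the standard NTK linearization with a squared-error loss and up to normalization constants; the notation (f_t, y, \Theta, F, \eta) is illustrative and not taken verbatim from the paper.

% Illustrative notation (assumed, not the paper's): f_t = network outputs on the
% training inputs, y = targets, \Theta = NTK Gram matrix, F = Fisher information
% matrix, \eta = learning rate.
%
% Gradient descent in function space (NTK linearization, squared-error loss):
\[ f_{t+1} - y \;\approx\; \bigl(I - \eta\,\Theta\bigr)\,(f_t - y) \]
% Exact NGD, \theta_{t+1} = \theta_t - \eta\,F^{+}\,\nabla_\theta \mathcal{L}(\theta_t),
% preconditions away the kernel, so every residual mode decays at the same rate:
\[ f_{t+1} - y \;\approx\; (1 - \eta)\,(f_t - y) \]

The paper's claim is that this isotropic, fast contraction is preserved in the infinite-width limit when the exact Fisher is replaced by suitable approximations.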
Review for NeurIPS paper: Understanding Approximate Fisher Information for Fast Convergence of Natural Gradient Descent in Wide Neural Networks
Additional Feedback:
Line 1: Change to "Natural Gradient Descent...".
Lines 10, 11: "the function space" should just be "function space".
Line 15: It might be worth pointing out here and/or in the intro that a special kind of data preprocessing (the "Forster transform") is required to get this result for K-FAC in general.
Lines 16, 46: "under some assumptions"/"under specific conditions" should perhaps be replaced with "under some approximating assumptions". AFAIK the "gradient independence assumption" doesn't have any rigorous justification and might not even be true in practice.
Line 69: "New insights and perspectives on the natural gradient method" also argues that the empirical Fisher is a poor substitute for the "true" one.
Line 71: The first quotation mark is backwards.
Line 79: Delete "firing" here.
Line 88: "We normalize each sample by" should be "We normalize each sample so that".
Line 90: "we overview" should be "we give an overview of".
Line 116: Although the use of damping in the context of NTK theory can be explained this way, damping plays a larger role in second-order optimization in general (where NTK theory doesn't necessarily apply). As written, it sounds as though its use is fully explained by this theory; I would suggest rephrasing.
Review for NeurIPS paper: Understanding Approximate Fisher Information for Fast Convergence of Natural Gradient Descent in Wide Neural Networks
This is a compelling paper which covers a lot of ground while keeping the presentation accessible and engaging for the reader. Interestingly, it finds that the K-FAC approximations match the exact NGD trajectory in function space but not weight space. The paper answers quite a lot of questions which are natural to ask, and (having worked a lot in this area) I found the answers interesting and novel. The reviewers seem to have checked it over pretty carefully and didn't spot any problems. The paper is well written, and the authors have clearly paid a lot of attention to the presentation of the ideas.