An Improved Empirical Fisher Approximation for Natural Gradient Descent

Xiaodong Wu, Philip Woodland

Mar-27-2025, 14:46:49 GMT–Neural Information Processing Systems

Approximate Natural Gradient Descent (NGD) methods are an important family of optimisers for deep learning models. They use approximate Fisher information matrices to pre-condition gradients during training. The empirical Fisher (EF) method approximates the Fisher information matrix empirically by reusing the per-sample gradients collected during back-propagation. Despite its ease of implementation, the EF approximation has both theoretical and practical limitations. This paper investigates the inversely-scaled projection issue of EF, which is shown to be a major cause of its poor empirical approximation quality. An improved empirical Fisher (iEF) method is proposed to address this issue. It is motivated as a generalised NGD method from a loss reduction perspective, while retaining the practical convenience of EF.
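As an illustration of the EF construction mentioned in the abstract, the sketch below builds the textbook empirical Fisher matrix (the average outer product of per-sample gradients) and uses it to pre-condition the mean gradient. It is not the paper's iEF method, which the abstract does not specify. The two-parameter logistic-regression model, the toy data, the damping term, the learning rate, and all function names are assumptions chosen only to keep the example self-contained.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def per_sample_grads(w, xs, ys):
    """Gradient of each sample's logistic loss w.r.t. the 2 parameters in w."""
    grads = []
    for x, y in zip(xs, ys):
        p = sigmoid(w[0] * x[0] + w[1] * x[1])
        grads.append([(p - y) * x[0], (p - y) * x[1]])
    return grads

def empirical_fisher(grads):
    """EF matrix: mean over samples of the outer product g_i g_i^T (2x2)."""
    n = len(grads)
    f = [[0.0, 0.0], [0.0, 0.0]]
    for g in grads:
        for a in range(2):
            for b in range(2):
                f[a][b] += g[a] * g[b] / n
    return f

def ef_direction(grads, damping=1e-3):
    """Solve (EF + damping * I) d = mean gradient for d (closed-form 2x2 solve).

    The damping term is an assumption added for numerical stability.
    """
    n = len(grads)
    mean_g = [sum(g[a] for g in grads) / n for a in range(2)]
    f = empirical_fisher(grads)
    a, b = f[0][0] + damping, f[0][1]
    c, d = f[1][0], f[1][1] + damping
    det = a * d - b * c
    direction = [(d * mean_g[0] - b * mean_g[1]) / det,
                 (-c * mean_g[0] + a * mean_g[1]) / det]
    return mean_g, direction

def ef_update(w, xs, ys, lr=0.1, damping=1e-3):
    """One EF-preconditioned gradient step (hypothetical hyperparameters)."""
    _, direction = ef_direction(per_sample_grads(w, xs, ys), damping)
    return [w[0] - lr * direction[0], w[1] - lr * direction[1]]

# Toy data (made up for illustration): four 2-D inputs with binary labels.
xs = [[1.0, 0.5], [-1.0, 0.3], [0.5, -1.0], [-0.5, -0.8]]
ys = [1.0, 0.0, 1.0, 0.0]
w0 = [0.0, 0.0]

grads0 = per_sample_grads(w0, xs, ys)
F = empirical_fisher(grads0)
mean_g0, direction0 = ef_direction(grads0)
w1 = ef_update(w0, xs, ys)
```

Because the damped EF matrix is symmetric positive definite, the pre-conditioned direction always has a positive inner product with the mean gradient, so stepping against it is a descent direction for a sufficiently small learning rate. In a deep learning setting the per-sample gradients would come from back-propagation rather than a closed-form expression.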

artificial intelligence, deep learning, machine learning


Country:
- Europe (1.00)
- North America
  - Canada > Ontario
    - Toronto (0.14)
  - United States (0.92)

Genre:
- Research Report > Experimental Study (1.00)

Industry:
- Education (0.67)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Neural Networks > Deep Learning (1.00)
  - Statistical Learning (1.00)
