Limitations of the Empirical Fisher Approximation for Natural Gradient Descent

Neural Information Processing Systems 

Several highly visible works have advocated an approximation known as the empirical Fisher, drawing connections between approximate second-order methods and heuristics like Adam.
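
For orientation, the distinction at stake can be written down in a few lines. The notation here is assumed for illustration and is not taken from the text above: a model $p_\theta(y \mid x)$ with parameters $\theta$, and training pairs $(x_n, y_n)$ for $n = 1, \dots, N$.

```latex
% Assumed notation (not from the source): model p_\theta(y | x), data (x_n, y_n), n = 1..N.
% Fisher information: the expectation over y is taken under the model's own predictive distribution.
\[
  F(\theta) \;=\; \sum_{n=1}^{N} \mathbb{E}_{y \sim p_\theta(y \mid x_n)}
  \Big[ \nabla_\theta \log p_\theta(y \mid x_n)\, \nabla_\theta \log p_\theta(y \mid x_n)^{\top} \Big]
\]
% Empirical Fisher: the expectation is replaced by the observed training label y_n.
\[
  \tilde{F}(\theta) \;=\; \sum_{n=1}^{N}
  \nabla_\theta \log p_\theta(y_n \mid x_n)\, \nabla_\theta \log p_\theta(y_n \mid x_n)^{\top}
\]
```

The two expressions differ only in where $y$ comes from: the Fisher samples it from the model, while the empirical Fisher plugs in the training labels. The connection to Adam is loose: Adam keeps a running average of elementwise squared gradients, which resembles the diagonal of an outer-product matrix of this second kind.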