Exact natural gradient in deep linear networks and its application to the nonlinear case