A Layer-Wise Natural Gradient Optimizer for Training Deep Neural Networks

Ant Group, Hangzhou, China