Parallel training of DNNs with Natural Gradient and Parameter Averaging