On variation of gradients of deep neural networks