Towards Better Generalization: BP-SVRG in Training Deep Neural Networks