Understanding Why Neural Networks Generalize Well Through GSNR of Parameters

Liu, Jinlong, Jiang, Guoqing, Bai, Yunzhi, Chen, Ting, Wang, Huayan

Jan-21-2020–arXiv.org Machine Learning

GSNR of a parameter is defined as the ratio between its gradient's squared mean and Previous work (Zhang et al., 2016; Hardt et al., 2015; Dziugaite & Roy, 2017) suggests that the The GSNR of a parameter is defined as the ratio between its gradient's squared mean and variance Previous work tried to use GSNR to conduct theoretical analysis on deep learning. For example, Rainforth et al. (2018) used GSNR to analyze variational bounds in Intuitively, GSNR measures the similarity of a parameter's gradients among different training samples. To reveal the mechanism of DNNs' good generalization ability, we show that the gradient descent We believe this is probably the key to DNNs' remarkable generalization ability. In the remainder of this paper we first analyze the relation between GSNR and generalization (Section 2). At a particular point of the parameter space, GSNR measures the consistency of a parameter's gradients across different data samples.

gradient, gsnr, model parameter, (15 more...)

arXiv.org Machine Learning

Jan-21-2020

arXiv.org PDF

Add feedback

Country:
- Asia > China > Beijing > Beijing (0.04)

Genre:
- Research Report (0.50)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Neural Networks > Deep Learning (0.49)
  - Statistical Learning > Gradient Descent (0.36)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found