Understanding Deflation Process in Over-parametrized Tensor Decomposition

Neural Information Processing Systems 

Recently, over-parametrization has been recognized as a key feature of neural network optimization. A line of work known as the Neural Tangent Kernel (NTK) showed that zero training loss is achievable when the network is sufficiently over-parametrized (Jacot et al., 2018; Du et al., 2018; Allen-Zhu et al., 2018b). However, the NTK theory implies a particular dynamics called lazy training, in which the neurons do not move much (Chizat et al., 2019), which is not natural in
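The lazy-training phenomenon mentioned above can be observed numerically. The following is a minimal sketch (not from the paper; the function name, hyperparameters, and NTK-style 1/sqrt(m) scaling are illustrative assumptions): train only the hidden layer of a two-layer ReLU network with gradient descent and measure how far the weights move from their initialization. As the width m grows, the relative movement shrinks even though the training loss still decreases.

```python
import numpy as np

def train_wide_net(m, X, y, lr=0.5, steps=300, seed=0):
    """Train the hidden layer of a width-m two-layer ReLU net with GD.

    Model: f(x) = (1/sqrt(m)) * sum_j a_j * relu(w_j . x),
    with the output signs a_j fixed at +/-1 (a common NTK-style setup).
    Returns (final training loss, relative movement ||W - W0|| / ||W0||).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W0 = rng.standard_normal((m, d))          # initial hidden weights
    a = rng.choice([-1.0, 1.0], size=m)       # fixed output layer
    W = W0.copy()
    for _ in range(steps):
        Z = X @ W.T                           # (n, m) pre-activations
        H = np.maximum(Z, 0.0)                # ReLU
        pred = H @ a / np.sqrt(m)
        resid = pred - y
        # Gradient of 0.5 * mean squared error with respect to W:
        # dL/dw_j = (1/(n*sqrt(m))) * sum_i resid_i * a_j * 1[z_ij > 0] * x_i
        G = ((resid[:, None] * (Z > 0) * a[None, :]).T @ X) / (n * np.sqrt(m))
        W -= lr * G
    loss = 0.5 * np.mean(resid ** 2)
    movement = np.linalg.norm(W - W0) / np.linalg.norm(W0)
    return loss, movement
```

Comparing, say, m = 50 against m = 5000 on the same small dataset shows the wider network's weights staying much closer to initialization, which is the "lazy" regime the abstract refers to.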
