Reviews: Eigenvalue Decay Implies Polynomial-Time Learnability for Neural Networks
–Neural Information Processing Systems
This paper considers the problem of square-loss regression for several function classes computed by neural networks. In contrast to what previous hardness results might suggest, they show that those function classes can be improperly learnt under marginal distribution assumptions alone. Specifically, they use previous structural results of approximate RKHS embedding and assume an eigenvalue decay assumption on the empirical gram matrix (namely, it is approximately low-rank). They then apply a recursive Nystrom sampling method to obtain an approximate and compressed version of the gram matrix, which is then used instead in the learning. The error of the resulting regression is then bounded using known compression generalization bounds, hence obtaining generalization bounds for kernelized regression using Compression Schemes.
Neural Information Processing Systems
Oct-7-2024, 21:29:43 GMT
- Technology: