








cf9dc5e4e194fc21f397b4cac9cc3ae9-Paper.pdf

Neural Information Processing Systems

However, the structure of their hidden layer representations is only theoretically well-understood in certain infinite-width limits, in which these representations cannot flexibly adapt to learn data-dependent features [3-11, 24]. In the Bayesian setting, these representations are described by fixed, deterministic kernels [3-11].
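To make the "fixed, deterministic kernel" picture concrete, here is an illustrative sketch (not taken from either paper) of the infinite-width Bayesian/NNGP kernel for a single ReLU hidden layer, using the standard arc-cosine form; the function relu_nngp_kernel and the prior scales sigma_w, sigma_b are this sketch's own naming choices.

```python
# Illustrative sketch: the data-independent NNGP kernel of an infinite-width,
# single-hidden-layer ReLU network (arc-cosine kernel form).
import numpy as np

def relu_nngp_kernel(X, sigma_w=1.0, sigma_b=0.0):
    """Return the NNGP kernel matrix K[i, j] for one ReLU hidden layer.

    X: (n, d) array of inputs; sigma_w, sigma_b: weight/bias prior scales.
    """
    d = X.shape[1]
    # Input-layer (linear) kernel: K0 = sigma_b^2 + sigma_w^2 * <x, x'> / d
    K0 = sigma_b**2 + sigma_w**2 * (X @ X.T) / d
    # Angle between inputs under K0
    diag = np.sqrt(np.diag(K0))
    cos_theta = np.clip(K0 / np.outer(diag, diag), -1.0, 1.0)
    theta = np.arccos(cos_theta)
    # ReLU (arc-cosine) kernel: a fixed functional form, with no feature learning
    K1 = (sigma_w**2 / (2 * np.pi)) * np.outer(diag, diag) * (
        np.sin(theta) + (np.pi - theta) * np.cos(theta)
    )
    return sigma_b**2 + K1

X = np.random.default_rng(0).normal(size=(5, 10))
print(relu_nngp_kernel(X).shape)  # (5, 5)
```

Note that the kernel depends only on the inputs and the prior variances, which is exactly the sense in which the infinite-width representation cannot adapt to the data.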



A self-consistent theory of Gaussian Processes captures feature learning effects in finite CNNs

Neural Information Processing Systems

Despite its theoretical appeal, this viewpoint lacks a crucial ingredient of deep learning in finite DNNs, lying at the heart of their success: feature learning. Here we consider DNNs trained with noisy gradient descent on a large training set and derive a self-consistent Gaussian Process theory accounting for strong finite-DNN and feature learning effects.
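For readers unfamiliar with the training protocol named in the abstract, below is a minimal sketch of "noisy gradient descent", i.e. Langevin-type updates whose stationary distribution is a Gibbs posterior over the weights. It uses a toy linear model and hyperparameter names (lr, temperature, weight_decay) chosen here for illustration only; it is not the paper's exact setup.

```python
# Minimal sketch of noisy gradient descent (Langevin-type) training on a toy
# quadratic problem: each step adds Gaussian noise scaled by the learning rate
# and a temperature, on top of the usual gradient and weight-decay terms.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))                               # training inputs
y = X @ rng.normal(size=20) + 0.1 * rng.normal(size=200)     # noisy targets

w = rng.normal(size=20)                                      # toy model weights
lr, temperature, weight_decay = 1e-3, 1e-3, 1e-2

for step in range(5000):
    grad = X.T @ (X @ w - y) / len(y) + weight_decay * w     # loss + prior term
    noise = np.sqrt(2 * lr * temperature) * rng.normal(size=w.shape)
    w = w - lr * grad + noise                                 # noisy GD step

print(float(np.mean((X @ w - y) ** 2)))                       # training MSE
```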