How Width and Data Shape Generalization Scaling Laws in Quadratic Neural Networks
