The Benefits of Over-parameterization at Initialization in Deep ReLU Networks
