How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks?
