Stronger Convergence Results for Deep Residual Networks: Network Width Scales Linearly with Training Data Size
