Training Over-parameterized Deep ResNet Is almost as Easy as Training a Two-layer Network

Open in new window