The Effect of Network Width on the Performance of Large-batch Training
