Stochastic Weight Averaging in Parallel: Large-Batch Training that Generalizes Well
