Magnitude Pruning of Large Pretrained Transformer Models with a Mixture Gaussian Prior
