Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models
