The Effect of Masking Strategies on Knowledge Retention by Language Models

Open in new window