How Does Diffusion Influence Pretrained Language Models on Out-of-Distribution Data?
