Towards Understanding Why Mask-Reconstruction Pretraining Helps in Downstream Tasks
