Mitigating Frequency Bias and Anisotropy in Language Model Pre-Training with Syntactic Smoothing
