A Massive Scale Semantic Similarity Dataset of Historical English

Neural Information Processing Systems 

A wide range of tasks use language models trained on semantic similarity data. While a variety of datasets capture semantic similarity, they are either constructed from modern web data or are relatively small and were created in the past decade by human annotators.