A Massive Scale Semantic Similarity Dataset of Historical English