Geospatial distributions reflect rates of evolution of features of language

Kauhanen, Henri, Gopal, Deepthi, Galla, Tobias, Bermúdez-Otero, Ricardo

arXiv.org Artificial Intelligence 

Quantifying the speed of linguistic change is challenging because the historical evolution of languages is sparsely documented. Consequently, traditional methods rely on phylogenetic reconstruction. In this paper, we propose a model-based approach to the problem through the analysis of language change as a stochastic process combining vertical descent, spatial interactions, and mutations in both dimensions. A notion of linguistic temperature emerges naturally from this analysis as a dimensionless measure of the propensity of a linguistic feature to undergo change. We demonstrate how temperatures of linguistic features can be inferred from their present-day geospatial distributions, without recourse to information about their phylogenies. Thus the evolutionary dynamics of language, operating across thousands of years, leave a measurable geospatial signature. This signature licenses inferences about the historical evolution of languages even in the absence of longitudinal data.
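The idea that a feature's propensity to change shows up in its spatial distribution can be illustrated with a toy simulation. The abstract does not specify the model, so everything in the sketch below is an assumption made for illustration: the square lattice with periodic boundaries, the binary feature, the parameters `q`, `p_in`, and `p_out`, and the helper names `simulate`, `boundary_density`, and `toy_temperature`. In particular, `toy_temperature` is a stand-in ratio of vertical change to spatial copying, not the paper's definition of linguistic temperature.

```python
import random


def simulate(size=20, steps=200_000, q=0.5, p_in=0.1, p_out=0.1, seed=0):
    """Toy lattice model (hypothetical): each site holds a binary feature.

    With probability q a site copies a random neighbour (spatial interaction);
    otherwise it may gain (p_in) or lose (p_out) the feature (vertical change).
    """
    rng = random.Random(seed)
    grid = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    neighbours = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    for _ in range(steps):
        i, j = rng.randrange(size), rng.randrange(size)
        if rng.random() < q:
            # Spatial event: adopt the state of one of the four neighbours.
            di, dj = rng.choice(neighbours)
            grid[i][j] = grid[(i + di) % size][(j + dj) % size]
        elif grid[i][j] == 0:
            # Vertical event: an absent feature may be gained.
            if rng.random() < p_in:
                grid[i][j] = 1
        else:
            # Vertical event: a present feature may be lost.
            if rng.random() < p_out:
                grid[i][j] = 0
    return grid


def boundary_density(grid):
    """Fraction of neighbouring site pairs whose feature values differ."""
    size = len(grid)
    disagree = 0
    for i in range(size):
        for j in range(size):
            disagree += grid[i][j] != grid[(i + 1) % size][j]
            disagree += grid[i][j] != grid[i][(j + 1) % size]
    return disagree / (2 * size * size)


def toy_temperature(q, p_in, p_out):
    """Assumed stand-in: vertical change rate relative to spatial copying."""
    return (1 - q) * (p_in + p_out) / q


if __name__ == "__main__":
    for label, (q, p) in {"stable": (0.99, 0.05), "volatile": (0.1, 0.5)}.items():
        grid = simulate(q=q, p_in=p, p_out=p, seed=1)
        print(label, toy_temperature(q, p, p), boundary_density(grid))
```

In this toy setting, a feature that changes rarely relative to how often it is copied between neighbours tends to form large uniform regions, while a volatile feature looks closer to spatial noise. Only the present-day grid is needed to compute `boundary_density`, which mirrors the abstract's point that no longitudinal data is required; the actual inference procedure is the paper's and is not reproduced here.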