DWUG: A large Resource of Diachronic Word Usage Graphs in Four Languages
Schlechtweg, Dominik, Tahmasebi, Nina, Hengchen, Simon, Dubossarsky, Haim, McGillivray, Barbara
arXiv.org Artificial Intelligence
Word meaning is notoriously difficult to capture, both synchronically and diachronically. In this paper, we describe the creation of the largest resource of graded, contextualized, diachronic word meaning annotation in four different languages, based on 100,000 human semantic proximity judgments. We thoroughly describe the multi-round incremental annotation process, the choice of a clustering algorithm to group usages into senses, and possible uses of this dataset, both diachronic and synchronic.
Jul-8-2024
- Country:
  - North America
    - United States
      - New York > New York County
        - New York City (0.04)
      - Minnesota > Hennepin County
        - Minneapolis (0.28)
      - Massachusetts
        - Suffolk County > Boston (0.04)
        - Middlesex County > Cambridge (0.04)
      - Louisiana > Orleans Parish
        - New Orleans (0.04)
    - Canada > British Columbia
  - Europe
    - United Kingdom > England
      - Cambridgeshire > Cambridge (0.14)
      - Oxfordshire > Oxford (0.04)
    - Sweden > Västra Götaland
      - Gothenburg (0.04)
    - Spain > Catalonia
      - Barcelona Province > Barcelona (0.04)
    - Italy > Tuscany
      - Florence (0.04)
    - Ireland > Leinster
      - County Dublin > Dublin (0.04)
    - Germany > Baden-Württemberg
      - Stuttgart Region > Stuttgart (0.05)
      - Tübingen Region > Tübingen (0.04)
    - France > Provence-Alpes-Côte d'Azur
      - Bouches-du-Rhône > Marseille (0.04)
    - Bulgaria > Varna Province
      - Varna (0.04)
- Genre:
  - Research Report (0.64)
- Industry:
  - Education (0.68)
- Technology: