Can Word Sense Distribution Detect Semantic Changes of Words?
Tang, Xiaohang, Zhou, Yi, Aida, Taichi, Sen, Procheta, Bollegala, Danushka
arXiv.org Artificial Intelligence
Semantic Change Detection (SCD) of words is an important task for various NLP applications that must make time-sensitive predictions. Over time, some words come to be used in novel ways to express new meanings, and these new meanings establish themselves as novel senses of existing words. Word Sense Disambiguation (WSD) methods, on the other hand, associate ambiguous words with sense ids depending on the context in which they occur. Given this relationship between WSD and SCD, we explore the possibility of predicting whether the meaning of a target word has changed between two corpora collected at different time steps, by comparing the distributions of senses of that word in each corpus. For this purpose, we use pretrained static sense embeddings to automatically annotate each occurrence of the target word in a corpus with a sense id. Next, we compute the distribution of sense ids of the target word in each corpus. Finally, we use different divergence or distance measures to quantify the semantic change of the target word across the two corpora. Our experimental results on the SemEval 2020 Task 1 dataset show that word sense distributions can be used to accurately predict semantic changes of words in English, German, Swedish and Latin.
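The last two steps of the pipeline described above (building a sense distribution per corpus, then comparing the two distributions) can be illustrated with a minimal Python sketch. It assumes the sense annotation has already been done, so each corpus is reduced to a list of sense ids for the target word. The function names and the sense labels are hypothetical, and Jensen-Shannon divergence is used only as one example of a divergence measure; the abstract does not say which measures the authors evaluate.

```python
import math
from collections import Counter


def sense_distribution(sense_ids, inventory):
    """Relative frequency of each sense in `inventory` among `sense_ids`."""
    if not sense_ids:
        raise ValueError("the target word has no annotated occurrences")
    counts = Counter(sense_ids)
    total = len(sense_ids)
    return [counts[sense] / total for sense in inventory]


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence with base-2 logs, so the value lies in [0, 1]."""
    mixture = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability in x contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, mixture) + 0.5 * kl(q, mixture)


def semantic_change_score(sense_ids_c1, sense_ids_c2):
    """Score the change of one target word between two corpora.

    Each argument is the list of sense ids assigned to the occurrences of
    the target word in one corpus (assumption: annotation happened upstream).
    """
    inventory = sorted(set(sense_ids_c1) | set(sense_ids_c2))
    p = sense_distribution(sense_ids_c1, inventory)
    q = sense_distribution(sense_ids_c2, inventory)
    return jensen_shannon_divergence(p, q)


if __name__ == "__main__":
    # Hypothetical sense ids for one target word in an older and a newer corpus.
    older = ["plane%aircraft"] * 2 + ["plane%surface"] * 8
    newer = ["plane%aircraft"] * 9 + ["plane%surface"] * 1
    print(f"change score: {semantic_change_score(older, newer):.3f}")
```

A score near 0 means the word is used with nearly the same mix of senses in both corpora, while a score near 1 means the sense usage has shifted almost completely. Any other divergence or distance over the two distributions could be substituted for `jensen_shannon_divergence` without changing the rest of the sketch.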
Oct-16-2023