A General Framework to Evaluate Methods for Assessing Dimensions of Lexical Semantic Change Using LLM-Generated Synthetic Data
Naomi Baes, Raphaël Merx, Nick Haslam, Ekaterina Vylomova, Haim Dubossarsky
arXiv.org Artificial Intelligence
Lexical Semantic Change (LSC) offers insights into cultural and social dynamics. However, the validity of methods for measuring different kinds of LSC has not been established, because historical benchmark datasets are lacking. To address this gap, we develop a novel three-stage evaluation framework that involves: 1) creating a scalable, domain-general methodology for generating synthetic datasets that simulate theory-driven LSC across time, leveraging In-Context Learning and a lexical database; 2) using these datasets to evaluate the effectiveness of various methods; and 3) assessing their suitability for specific dimensions and domains. We apply this framework to simulate changes across three key dimensions of LSC (SIB: Sentiment, Intensity, and Breadth) using examples from psychology, and evaluate how sensitive selected methods are to these artificially induced changes. Our findings support the utility of the synthetic data approach, validate the efficacy of tailored methods for detecting synthetic changes in SIB, and reveal that a state-of-the-art LSC model faces challenges in detecting affective dimensions of LSC. This framework provides a valuable tool for dimension- and domain-specific benchmarking and evaluation of LSC methods, with particular benefits for the social sciences.
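The evaluation idea in the abstract (inject a controlled amount of synthetic change into a corpus, then check whether a measure responds) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the tiny valence lexicon, the hand-written sentences standing in for LLM-generated variants, the target term, and the helper names `sentence_valence`, `make_bin`, and `sentiment_by_ratio`.

```python
import random
from statistics import mean

# Hypothetical toy valence lexicon (an assumption, not from the paper).
VALENCE = {"harmful": -1.0, "severe": -1.0, "distressing": -1.0,
           "manageable": 1.0, "hopeful": 1.0, "supportive": 1.0}

# Invented stand-ins: "natural" usages of a target term, and synthetic
# variants with raised sentiment (the paper generates such variants with an LLM).
NATURAL = [
    "trauma can be harmful to memory",
    "the trauma was severe and distressing",
    "trauma is discussed in the clinic",
    "researchers study trauma in adults",
]
SYNTHETIC_POSITIVE = [
    "trauma can be manageable with care",
    "the trauma response was hopeful and supportive",
    "trauma recovery is hopeful",
    "supportive therapy makes trauma manageable",
]

def sentence_valence(sentence):
    """Mean valence of the lexicon words in a sentence (0.0 if there are none)."""
    scores = [VALENCE[w] for w in sentence.split() if w in VALENCE]
    return mean(scores) if scores else 0.0

def make_bin(injection_ratio, size, rng):
    """Build one corpus bin in which a given fraction of usages is synthetic."""
    n_synth = round(injection_ratio * size)
    return ([rng.choice(SYNTHETIC_POSITIVE) for _ in range(n_synth)]
            + [rng.choice(NATURAL) for _ in range(size - n_synth)])

def sentiment_by_ratio(ratios, size=200, seed=0):
    """Mean sentence valence for each injection ratio."""
    rng = random.Random(seed)
    return [mean(sentence_valence(s) for s in make_bin(r, size, rng))
            for r in ratios]

ratios = [0.0, 0.25, 0.5, 0.75, 1.0]
scores = sentiment_by_ratio(ratios)
for r, s in zip(ratios, scores):
    print(f"injection={r:.2f}  mean_valence={s:+.3f}")
```

A measure that is sensitive to the induced change should rise as the injection ratio rises. The same loop could in principle wrap any other scoring function in place of the toy lexicon.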
Mar-11-2025
- Country:
  - Oceania > Australia
    - Australian Capital Territory > Canberra (0.04)
  - North America
    - United States
      - Illinois (0.04)
      - New York > New York County
        - New York City (0.04)
      - New Mexico > Santa Fe County
        - Santa Fe (0.04)
      - Minnesota > Hennepin County
        - Minneapolis (0.14)
      - Florida > Miami-Dade County
        - Miami (0.04)
    - Mexico > Mexico City
      - Mexico City (0.04)
    - Canada > Ontario
      - Toronto (0.04)
  - Europe
    - United Kingdom > England
      - Cambridgeshire > Cambridge (0.14)
      - West Midlands > Birmingham (0.04)
      - Oxfordshire > Oxford (0.04)
    - Middle East > Malta
      - Eastern Region > Northern Harbour District > St. Julian's (0.04)
    - Italy > Tuscany
      - Florence (0.04)
    - Denmark > Capital Region
      - Copenhagen (0.04)
    - Bulgaria > Varna Province
      - Varna (0.04)
  - Asia
    - China > Hong Kong (0.04)
    - Singapore (0.04)
    - Thailand > Bangkok
      - Bangkok (0.04)
    - Middle East > UAE
      - Abu Dhabi Emirate > Abu Dhabi (0.04)
- Genre:
  - Research Report > New Finding (1.00)
- Industry:
  - Law (1.00)
  - Health & Medicine
    - Pharmaceuticals & Biotechnology (1.00)
    - Consumer Health (1.00)
    - Health Care Providers & Services (0.93)
    - Therapeutic Area > Psychiatry/Psychology
      - Addiction Disorder (1.00)
      - Mental Health (0.68)
- Technology: