Extracting Lexical Features from Dialects via Interpretable Dialect Classifiers
Xie, Roy, Ahia, Orevaoghene, Tsvetkov, Yulia, Anastasopoulos, Antonios
arXiv.org Artificial Intelligence
Identifying linguistic differences between dialects of a language often requires expert knowledge and meticulous human analysis. This is largely due to the complexity and nuance involved in studying various dialects. We present a novel approach to extract distinguishing lexical features of dialects by utilizing interpretable dialect classifiers, even in the absence of human experts. We explore both post-hoc and intrinsic approaches to interpretability and, in experiments on Mandarin, Italian, and Low Saxon, show that our method successfully identifies key language-specific lexical features that contribute to dialectal variations.
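To make the idea of an intrinsically interpretable dialect classifier concrete, the following is a minimal sketch and not the authors' implementation. It assumes a two-dialect setting with whitespace tokenization and balanced classes, and it swaps in a smoothed Naive Bayes log-odds scorer as a stand-in for whatever models the paper uses. The function names (`word_weights`, `classify`, `top_features`) and the toy corpora with placeholder tokens (`alpha`, `beta`, `shared`, and so on) are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def word_weights(docs_a, docs_b, alpha=1.0):
    """Per-word log-odds weights with add-alpha smoothing.

    Positive weights favor dialect A, negative weights favor dialect B.
    """
    counts_a = Counter(tok for doc in docs_a for tok in doc.split())
    counts_b = Counter(tok for doc in docs_b for tok in doc.split())
    vocab = set(counts_a) | set(counts_b)
    total_a = sum(counts_a.values()) + alpha * len(vocab)
    total_b = sum(counts_b.values()) + alpha * len(vocab)
    return {
        w: math.log((counts_a[w] + alpha) / total_a)
        - math.log((counts_b[w] + alpha) / total_b)
        for w in vocab
    }


def classify(text, weights):
    """Linear decision: sum the weights of the tokens (equal class priors assumed)."""
    score = sum(weights.get(tok, 0.0) for tok in text.split())
    return "A" if score > 0 else "B"


def top_features(weights, k=3):
    """Return the k words most indicative of A and the k most indicative of B."""
    ranked = sorted(weights.items(), key=lambda kv: kv[1])
    top_a = [w for w, _ in ranked[-k:][::-1]]
    top_b = [w for w, _ in ranked[:k]]
    return top_a, top_b


# Hypothetical toy corpora: placeholder tokens, not real dialect data.
docs_a = ["alpha shared common", "alpha shared other", "alpha common"]
docs_b = ["beta shared common", "beta shared other", "beta common"]

weights = word_weights(docs_a, docs_b)
top_a, top_b = top_features(weights, k=1)
print("Most indicative of A:", top_a)
print("Most indicative of B:", top_b)
print("Prediction:", classify("alpha shared", weights))
```

Because the decision is a plain sum of per-word weights, the same numbers that drive the prediction can be read off directly as candidate distinguishing lexical features. A post-hoc approach would instead train an arbitrary classifier and attribute its predictions to input words afterwards.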
Mar-23-2024
- Country:
- Asia
- Europe
- Netherlands > Overijssel (0.04)
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- Croatia > Dubrovnik-Neretva County
- Dubrovnik (0.04)
- Spain > Catalonia
- Barcelona Province > Barcelona (0.04)
- Italy
- Campania (0.04)
- Emilia-Romagna (0.04)
- Liguria (0.04)
- Sicily (0.04)
- Veneto (0.04)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- Romania > Sud - Muntenia Development Region
- Giurgiu County > Giurgiu (0.04)
- Germany
- Lower Saxony (0.04)
- Mecklenburg-Vorpommern (0.04)
- Iceland > Capital Region
- Reykjavik (0.04)
- North America
- Dominican Republic (0.04)
- United States
- New Mexico > Santa Fe County
- Santa Fe (0.04)
- Oregon > Multnomah County
- Portland (0.04)
- Oceania > Australia
- New South Wales > Sydney (0.04)
- Genre:
- Research Report (1.00)