Extracting Lexical Features from Dialects via Interpretable Dialect Classifiers

Xie, Roy, Ahia, Orevaoghene, Tsvetkov, Yulia, Anastasopoulos, Antonios

Mar-23-2024–arXiv.org Artificial Intelligence

Identifying linguistic differences between dialects of a language often requires expert knowledge and meticulous human analysis. This is largely due to the complexity and nuance involved in studying various dialects. We present a novel approach to extract distinguishing lexical features of dialects by utilizing interpretable dialect classifiers, even in the absence of human experts. We explore both post-hoc and intrinsic approaches to interpretability, conduct experiments on Mandarin, Italian, and Low Saxon, and experimentally demonstrate that our method successfully identifies key language-specific lexical features that contribute to dialectal variations.
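As a rough illustration of the post-hoc route mentioned above, the sketch below trains a tiny word-level classifier and then scores each word by how much the prediction margin drops when that word is left out. It is not the authors' implementation: the naive Bayes model, the British/American English toy sentences, and every function name here are assumptions chosen only to keep the example self-contained.

```python
import math
from collections import Counter

# Toy labelled sentences, invented for illustration (not data from the paper).
TRAIN = [
    ("the lorry is in the car park", "en-GB"),
    ("i like the colour of the lorry", "en-GB"),
    ("the truck is in the parking lot", "en-US"),
    ("i like the color of the truck", "en-US"),
]


def train_naive_bayes(examples):
    """Count word frequencies per dialect label and collect the vocabulary."""
    counts = {}
    for text, label in examples:
        counts.setdefault(label, Counter()).update(text.split())
    vocab = {word for counter in counts.values() for word in counter}
    return counts, vocab


def log_score(words, label, counts, vocab):
    """Sum of add-one-smoothed log-likelihoods of the words under one label."""
    total = sum(counts[label].values())
    return sum(
        math.log((counts[label][w] + 1) / (total + len(vocab))) for w in words
    )


def margin(words, target, counts, vocab):
    """Log-score of the target label minus the best competing label."""
    others = [
        log_score(words, label, counts, vocab)
        for label in counts
        if label != target
    ]
    return log_score(words, target, counts, vocab) - max(others)


def leave_one_out(text, counts, vocab):
    """Rank words by the margin lost when each one is removed (post-hoc)."""
    words = text.split()
    predicted = max(counts, key=lambda l: log_score(words, l, counts, vocab))
    base = margin(words, predicted, counts, vocab)
    scores = []
    for i, word in enumerate(words):
        reduced = words[:i] + words[i + 1:]
        scores.append((word, base - margin(reduced, predicted, counts, vocab)))
    return predicted, sorted(scores, key=lambda pair: pair[1], reverse=True)


counts, vocab = train_naive_bayes(TRAIN)
predicted, ranked = leave_one_out("the colour of the lorry", counts, vocab)
print(predicted)
for word, importance in ranked:
    print(f"{word}\t{importance:.3f}")
```

Words shared by both toy dialects (such as "the") receive an importance near zero, while dialect-specific words rise to the top of the ranking. An intrinsic approach would instead build the explanation into the model itself; that variant is not sketched here.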

computational linguistic, dialect, explanation

Country:
- Oceania > Australia
  - New South Wales > Sydney (0.04)
- North America
  - Dominican Republic (0.04)
  - United States
    - Oregon > Multnomah County
      - Portland (0.04)
    - New Mexico > Santa Fe County
      - Santa Fe (0.04)
- Europe
  - Netherlands > Overijssel (0.04)
  - Iceland > Capital Region
    - Reykjavik (0.04)
  - Germany
    - Mecklenburg-Vorpommern (0.04)
    - Lower Saxony (0.04)
  - Romania > Sud - Muntenia Development Region
    - Giurgiu County > Giurgiu (0.04)
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
  - Italy
    - Veneto (0.04)
    - Sicily (0.04)
    - Liguria (0.04)
    - Emilia-Romagna (0.04)
    - Campania (0.04)
  - Spain > Catalonia
    - Barcelona Province > Barcelona (0.04)
  - Croatia > Dubrovnik-Neretva County
    - Dubrovnik (0.04)
  - Belgium > Brussels-Capital Region
    - Brussels (0.04)
- Asia
  - Taiwan (0.05)
  - Indonesia > Bali (0.04)
  - China > Hong Kong (0.04)
  - Thailand > Phuket
    - Phuket (0.04)

Genre:
- Research Report (1.00)

Industry:
- Government > Regional Government > North America Government > United States Government (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Machine Learning (1.00)
