Can a Neural Model Guide Fieldwork? A Case Study on Morphological Data Collection
Mahmudi, Aso; Herce, Borja; Amestica, Demian Inostroza; Scherbakov, Andreas; Hovy, Eduard; Vylomova, Ekaterina
arXiv.org Artificial Intelligence
Linguistic fieldwork is an important component in language documentation and preservation. However, it is a long, exhaustive, and time-consuming process. This paper presents a novel model that guides a linguist during the fieldwork and accounts for the dynamics of linguist-speaker interactions. We introduce a novel framework that evaluates the efficiency of various sampling strategies for obtaining morphological data and assesses the effectiveness of state-of-the-art neural models in generalising morphological structures. Our experiments highlight two key strategies for improving the efficiency: (1) increasing the diversity of annotated data by uniform sampling among the cells of the paradigm tables, and (2) using model confidence as a guide to enhance positive interaction by providing reliable predictions during annotation.
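The two strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `(lemma, cell)` data layout, the round-robin scheme, and the 0.9 confidence threshold are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def sample_uniform_over_cells(pool, budget, seed=0):
    """Pick up to `budget` items, spreading picks evenly over paradigm cells.

    `pool` is a list of (lemma, cell) pairs that are not yet annotated
    (hypothetical layout). Items are drawn round-robin across cells so that
    rare cells are not crowded out by frequent ones.
    """
    rng = random.Random(seed)
    by_cell = defaultdict(list)
    for lemma, cell in pool:
        by_cell[cell].append((lemma, cell))
    for items in by_cell.values():
        rng.shuffle(items)  # random choice of lemma within each cell

    cells = sorted(by_cell)
    picked = []
    while len(picked) < budget and any(by_cell[c] for c in cells):
        for c in cells:
            if by_cell[c] and len(picked) < budget:
                picked.append(by_cell[c].pop())
    return picked


def split_by_confidence(predictions, threshold=0.9):
    """Split model predictions into forms to suggest and forms to elicit.

    `predictions` is a list of (item, predicted_form, confidence) triples.
    The threshold is an illustrative value, not one taken from the paper.
    """
    suggest = [p for p in predictions if p[2] >= threshold]
    ask = [p for p in predictions if p[2] < threshold]
    return suggest, ask
```

In a loop of this kind, high-confidence predictions would be shown to the speaker for confirmation, while low-confidence cells would be elicited directly.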
Dec-14-2024
- Country:
  - Asia
    - Middle East
      - Jordan (0.04)
      - UAE > Abu Dhabi Emirate
        - Abu Dhabi (0.04)
    - Thailand > Bangkok
      - Bangkok (0.04)
  - Europe
    - France > Provence-Alpes-Côte d'Azur
      - Bouches-du-Rhône > Marseille (0.04)
    - Germany > North Rhine-Westphalia
      - Upper Bavaria > Munich (0.04)
    - Hungary > Jász-Nagykun-Szolnok County
      - Szolnok (0.04)
    - Ireland > Leinster
      - County Dublin > Dublin (0.04)
    - Italy > Tuscany
      - Florence (0.04)
    - Switzerland > Zürich
      - Zürich (0.04)
    - United Kingdom > England
      - Oxfordshire > Oxford (0.04)
  - North America
    - Canada > Ontario
      - Toronto (0.04)
    - United States
      - Georgia > Fulton County
        - Atlanta (0.04)
      - Texas > Travis County
        - Austin (0.04)
      - Washington > King County
        - Seattle (0.04)
  - Oceania > Australia
    - Australian Capital Territory > Canberra (0.04)
- Genre:
  - Research Report
    - New Finding (0.93)
    - Promising Solution (0.66)
- Technology: