Prompt and circumstance: A word-by-word LLM prompting approach to interlinear glossing for low-resource languages

Feb-13-2025–arXiv.org Artificial Intelligence

Partly automated creation of interlinear glossed text (IGT) has the potential to assist in linguistic documentation. We argue that LLMs can make this process more accessible to linguists because of their capacity to follow natural-language instructions. We investigate the effectiveness of a retrieval-based LLM prompting approach to glossing, applied to the seven languages from the SIGMORPHON 2023 shared task. Our system beats the BERT-based shared task baseline for every language in the morpheme-level score category, and we show that a simple 3-best oracle has higher word-level scores than the challenge winner (a tuned sequence model) in five languages. In a case study on Tsez, we ask the LLM to automatically create and follow linguistic instructions, reducing errors on a confusing grammatical feature. Our results thus demonstrate the potential contributions that LLMs can make in interactive systems for glossing, both in making suggestions to human annotators and in following directions.
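The abstract does not define the "3-best oracle" score. One common reading, assumed here, is that a word counts as correct if the gold gloss appears anywhere among the system's top three candidates for that word. The sketch below illustrates that reading only; the function name `oracle_word_accuracy` and the toy glosses are hypothetical and are not taken from the paper or the shared task data.

```python
from typing import Sequence


def oracle_word_accuracy(
    candidates: Sequence[Sequence[str]],
    gold: Sequence[str],
    k: int = 3,
) -> float:
    """Word-level accuracy under a k-best oracle (hypothetical helper).

    candidates[i] is a ranked list of candidate glosses for word i;
    gold[i] is the reference gloss. A word is counted as correct if the
    reference appears among its first k candidates.
    """
    if len(candidates) != len(gold):
        raise ValueError("candidates and gold must have the same length")
    if not gold:
        return 0.0
    hits = sum(1 for cands, ref in zip(candidates, gold) if ref in cands[:k])
    return hits / len(gold)


# Toy, made-up glosses for three words (not real data).
toy_candidates = [
    ["dog-PL", "dog", "dog-ERG"],
    ["run-PST", "run", "run-PRS"],
    ["DEF", "the", "DEM"],
]
toy_gold = ["dog-PL", "run", "INDF"]

top1 = oracle_word_accuracy(toy_candidates, toy_gold, k=1)
top3 = oracle_word_accuracy(toy_candidates, toy_gold, k=3)
```

With `k=1` this reduces to ordinary word-level accuracy, so the gap between the `k=1` and `k=3` values gives a rough sense of how much a human annotator choosing among suggestions could recover.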

large language model, machine learning, natural language

Country:
- Africa > Sudan (0.04)
- North America
  - United States
    - Ohio (0.04)
    - New Mexico > Santa Fe County
      - Santa Fe (0.04)
    - Florida > Miami-Dade County
      - Miami (0.04)
    - Colorado > Boulder County
      - Boulder (0.04)
  - Canada > Ontario
    - Toronto (0.04)
- Europe
  - Spain > Catalonia
    - Barcelona Province > Barcelona (0.04)
  - Portugal > Lisbon
    - Lisbon (0.04)
  - Germany
    - Saxony > Leipzig (0.04)
    - Baden-Württemberg > Tübingen Region
      - Tübingen (0.04)
- Asia > Middle East
  - Jordan (0.04)

Genre:
- Research Report > New Finding (0.88)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.47)
