Syntactic Transfer to Kyrgyz Using the Treebank Translation Method
Alekseev, Anton, Tillabaeva, Alina, Kabaeva, Gulnara Dzh., Nikolenko, Sergey I.
arXiv.org Artificial Intelligence
Kyrgyz is a low-resource language, and creating high-quality syntactic corpora for it requires significant effort. This study proposes an approach to simplify the development of a syntactic corpus for Kyrgyz. We present a tool for transferring syntactic annotations from Turkish to Kyrgyz based on a treebank translation method. The effectiveness of the proposed tool was evaluated using the TueCL treebank. The results demonstrate that this approach achieves higher syntactic annotation accuracy than a monolingual model trained on the Kyrgyz KTMU treebank. Additionally, the study introduces a method for assessing the complexity of manual annotation for the resulting syntactic trees, which can help further optimize the annotation process.
Dec-17-2024
- Country:
- Africa > Middle East
- Egypt > Cairo Governorate > Cairo (0.04)
- Asia
- Kyrgyzstan > Chüy Region
- Bishkek (0.04)
- Russia (0.05)
- Singapore (0.04)
- Europe
- North America > United States
- Michigan > Washtenaw County > Ann Arbor (0.04)
- Genre:
- Research Report > New Finding (0.66)
- Technology: