Coreference Resolution for Vietnamese Narrative Texts
Tran, Hieu-Dai; Nguyen, Duc-Vu; Nguyen, Ngan Luu-Thuy
arXiv.org Artificial Intelligence
Coreference resolution is a vital task in natural language processing (NLP) that involves identifying and linking different expressions in a text that refer to the same entity. This task is particularly challenging for Vietnamese, a low-resource language with limited annotated datasets. To address these challenges, we developed a comprehensive annotated dataset using narrative texts from VnExpress, a widely read Vietnamese online news platform. We established detailed guidelines for annotating entities, with a focus on consistency and accuracy. Additionally, we evaluated the performance of large language models (LLMs), specifically GPT-3.5-Turbo and GPT-4, on this dataset. Our results demonstrate that GPT-4 significantly outperforms GPT-3.5-Turbo in terms of both accuracy and response consistency, making it a more reliable tool for coreference resolution in Vietnamese.
Apr-29-2025
- Country:
- Asia
- Thailand (0.04)
- Vietnam > Hồ Chí Minh City
- Hồ Chí Minh City (0.04)
- Europe > Slovenia
- Drava > Municipality of Benedikt > Benedikt (0.04)
- Genre:
- Research Report > New Finding (1.00)
- Technology: