🤖 AI Summary
This study addresses coreference resolution for Vietnamese narrative texts under low-resource conditions. We construct the first high-quality, manually annotated Vietnamese coreference dataset, derived from VnExpress news articles, filling a critical gap in annotated resources for the language. Using standardized annotation guidelines and prompt engineering, we systematically evaluate GPT-3.5-Turbo and GPT-4 under zero-shot and few-shot settings. Experimental results show that GPT-4 significantly outperforms GPT-3.5-Turbo in both accuracy and response consistency, supporting its viability as a practical tool for Vietnamese coreference resolution. Our key contributions are: (1) the release of the first Vietnamese coreference resolution benchmark specifically designed for narrative text; and (2) the first empirical investigation into the capabilities and limitations of large language models on coreference tasks in low-resource languages, providing concrete evidence of their applicability and performance boundaries.
📝 Abstract
Coreference resolution is a vital task in natural language processing (NLP) that involves identifying and linking the different expressions in a text that refer to the same entity. The task is particularly challenging for Vietnamese, a low-resource language with limited annotated datasets. To address this challenge, we developed a comprehensive annotated dataset using narrative texts from VnExpress, a widely read Vietnamese online news platform. We established detailed guidelines for annotating entities, with a focus on consistency and accuracy. Additionally, we evaluated the performance of large language models (LLMs), specifically GPT-3.5-Turbo and GPT-4, on this dataset. Our results demonstrate that GPT-4 significantly outperforms GPT-3.5-Turbo in both accuracy and response consistency, making it a more reliable tool for coreference resolution in Vietnamese.
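To make the task definition concrete, the sketch below groups mentions into coreference clusters from pairwise "refers to the same entity" links, using a small union-find. It is only an illustration: the sentence, the mention list, the links, and the helper name `cluster_mentions` are all hypothetical and are not taken from the dataset or the annotation guidelines. Mentions are identified by their surface strings for brevity; a real annotation scheme would more likely use text spans.

```python
from collections import defaultdict


def cluster_mentions(mentions, links):
    """Group mentions into coreference clusters from pairwise links (union-find)."""
    parent = {m: m for m in mentions}

    def find(m):
        # Walk up to the cluster representative, shortening the path as we go.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters = defaultdict(list)
    for m in mentions:  # keeps the input order inside each cluster
        clusters[find(m)].append(m)
    return list(clusters.values())


# Toy example (hypothetical, not from the dataset):
# "Ông Nam đến Hà Nội. Ông ấy gặp bạn của mình ở đó."
mentions = ["Ông Nam", "Hà Nội", "Ông ấy", "mình", "đó"]
links = [("Ông Nam", "Ông ấy"), ("Ông ấy", "mình"), ("Hà Nội", "đó")]

clusters = cluster_mentions(mentions, links)
print(clusters)
```

Here the person mentions ("Ông Nam", "Ông ấy", "mình") end up in one cluster and the place mentions ("Hà Nội", "đó") in another, which is the kind of output a coreference system, or an LLM prompted for this task, is expected to produce.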