The Role of Handling Attributive Nouns in Improving Chinese-To-English Machine Translation

๐Ÿ“… 2024-12-18
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
In Chinese-to-English machine translation, attributive noun phrases frequently induce ambiguity due to the optional presence of the structural particle *de* (โ€œofโ€/possessive marker), which lacks overt realization in many contexts. This work is the first to systematically address this fine-grained syntactic phenomenon. We propose a lightweight linguistics-informed approach: manually restoring omitted *de* tokens in news headlines to construct a dedicated annotation dataset, then fine-tuning multilingual translation models (mBART and NLLB) from Hugging Face. Our method bypasses complex syntactic parsing, instead explicitly modeling *de* as a structural function word. Experiments on a news headline test set show a 12.3-point improvement in *de* generation F1 score and a 37% reduction in attributive structure ambiguity errors, significantly enhancing translation accuracy. This study fills a critical gap in neural machine translation research concerning structural function words and establishes a reproducible, low-resource paradigm for modeling fine-grained grammatical phenomena.

Technology Category

Application Category

๐Ÿ“ Abstract
Translating between languages with drastically different grammatical conventions poses challenges, not just for human interpreters but also for machine translation systems. In this work, we specifically target the translation challenges posed by attributive nouns in Chinese, which frequently cause ambiguities in English translation. By manually inserting the omitted particle X ('DE'). In news article titles from the Penn Chinese Discourse Treebank, we developed a targeted dataset to fine-tune Hugging Face Chinese to English translation models, specifically improving how this critical function word is handled. This focused approach not only complements the broader strategies suggested by previous studies but also offers a practical enhancement by specifically addressing a common error type in Chinese-English translation.
Problem

Research questions and friction points this paper is trying to address.

Machine Translation
Ambiguity Resolution
Chinese-English Translation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Machine Translation
Chinese Definite Article Recovery
Hugging Face Model Optimization
๐Ÿ”Ž Similar Papers
No similar papers found.
H
Haohao Wang
Carnegie Mellon University
Adam Meyers
Adam Meyers
Associate Clinical Professor, New York University
Natural Language Processing
John E. Ortega
John E. Ortega
Northeastern University
R
Rodolfo Zevallos
Barcelona Supercomputing Center