🤖 AI Summary
This paper addresses the long-standing challenge of cross-sentence pronoun coreference errors in neural machine translation (NMT). It proposes ProNMT, a novel framework built around a quality estimation (QE) feedback training paradigm driven by pronoun-generation likelihood. Without requiring human annotations, ProNMT fine-tunes a context-aware NMT model by combining a QE model with pronoun-specific likelihood rewards in a reinforcement-learning-style objective, jointly improving pronoun translation accuracy and overall translation quality. Evaluated on multilingual benchmarks, ProNMT achieves a 12.3% absolute improvement in pronoun accuracy, alongside consistent gains in BLEU (+1.4) and COMET (+2.1). These results demonstrate ProNMT's core contribution: strengthening contextual modeling while remaining scalable, thereby advancing robust, discourse-aware NMT.
📝 Abstract
Pronoun translation is a longstanding challenge in neural machine translation (NMT), often requiring inter-sentential context to ensure linguistic accuracy. To address this, we introduce ProNMT, a novel framework designed to enhance pronoun and overall translation quality in context-aware machine translation systems. ProNMT leverages Quality Estimation (QE) models and a unique Pronoun Generation Likelihood-Based Feedback mechanism to iteratively fine-tune pre-trained NMT models without relying on extensive human annotations. The framework combines QE scores with pronoun-specific rewards to guide training, ensuring improved handling of linguistic nuances. Extensive experiments demonstrate significant gains in pronoun translation accuracy and general translation quality across multiple metrics. ProNMT offers an efficient, scalable, and context-aware approach to improving NMT systems, particularly in translating context-dependent elements like pronouns.
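The abstract describes combining QE scores with pronoun-specific rewards to guide fine-tuning. The exact reward formula is not given here; the sketch below assumes a simple linear interpolation between a sentence-level QE score and the model's average generation probability over target-side pronoun tokens, with the function name, `alpha` weight, and fallback behavior all being illustrative choices rather than the paper's actual design.

```python
import math

def pronoun_reward(pronoun_logprobs, qe_score, alpha=0.5):
    """Illustrative reward combining a QE score with a pronoun-likelihood signal.

    pronoun_logprobs: log-probabilities the NMT model assigned to each
        target-side pronoun token (empty if the sentence has no pronouns).
    qe_score: sentence-level quality-estimation score, assumed in [0, 1].
    alpha: hypothetical interpolation weight between the two signals.
    """
    if pronoun_logprobs:
        # Average per-pronoun probability, mapped back from log space.
        pron_score = sum(math.exp(lp) for lp in pronoun_logprobs) / len(pronoun_logprobs)
    else:
        # No pronouns in the sentence: fall back to the QE signal alone.
        pron_score = qe_score
    return alpha * qe_score + (1 - alpha) * pron_score
```

In a feedback loop of this kind, the scalar returned here would weight the policy-gradient update for each sampled translation, so sentences whose pronouns the model generates confidently (and that score well under QE) are reinforced.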