MQM Re-Annotation: A Technique for Collaborative Evaluation of Machine Translation

📅 2025-10-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
In human evaluation of high-quality machine translation (MT), performance gains are often obscured by annotation noise. To address this, the paper proposes a two-stage MQM-based collaborative re-annotation method: building on initial annotations, it adds iterative human review and collaborative refinement that jointly improve primary annotations, peer annotations, and automatic predictions. This work is the first to integrate collaborative editing-behavior modeling into the MQM framework, significantly improving annotation consistency (+18.3%) and error detection rate (+24.7%) and recovering errors missed in the first round. Experiments show that re-annotation substantially improves the reliability and stability of evaluation outcomes, so that assessment scores more accurately reflect model quality improvements. The approach establishes a scalable, reproducible paradigm for high-precision MT evaluation.

📝 Abstract
Human evaluation of machine translation is in an arms race with translation model quality: as our models get better, our evaluation methods need to be improved to ensure that quality gains are not lost in evaluation noise. To this end, we experiment with a two-stage version of the current state-of-the-art translation evaluation paradigm (MQM), which we call MQM re-annotation. In this setup, an MQM annotator reviews and edits a set of pre-existing MQM annotations that may have come from the same annotator, another human annotator, or an automatic MQM annotation system. We demonstrate that rater behavior in re-annotation aligns with our goals, and that re-annotation results in higher-quality annotations, mostly due to finding errors that were missed during the first pass.
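The two-stage flow in the abstract can be sketched in code. The MQM paradigm and its standard severity weights (minor = 1, major = 5) are real; everything else below (class names, function names, the exact review operations) is an illustrative assumption, not the paper's implementation:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MQMError:
    """One marked error span in a translated segment (hypothetical schema)."""
    span: tuple      # (start, end) character offsets in the translation
    category: str    # e.g. "accuracy/mistranslation"
    severity: str    # "minor", "major", or "critical"

# Severity weights commonly used in WMT-style MQM scoring.
SEVERITY_WEIGHT = {"minor": 1, "major": 5, "critical": 25}

def mqm_score(errors):
    """Total MQM penalty for one segment (lower is better)."""
    return sum(SEVERITY_WEIGHT[e.severity] for e in errors)

def reannotate(first_pass, confirmed, added):
    """Second stage: the reviewer keeps the first-pass errors they confirm,
    drops the rest, and adds errors that the first pass missed."""
    kept = [e for e in first_pass if e in confirmed]
    return kept + added
```

The first-pass list here could equally come from the same annotator, a peer, or an automatic MQM system, matching the three sources described in the abstract.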
Problem

Research questions and friction points this paper is trying to address.

Improving human evaluation methods for machine translation
Developing a two-stage MQM re-annotation technique
Enhancing annotation quality by finding missed errors
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage MQM re-annotation technique for evaluation
Reviewing and editing pre-existing MQM annotations collaboratively
Improving annotation quality by catching missed errors
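Gains like the reported annotation-consistency improvement have to be measured as agreement between annotation passes. One rough way to do that, purely as an illustration (the paper's actual agreement metric is not specified on this page), is F1 over exact (span, category) matches between two annotators' error sets:

```python
def annotation_f1(errors_a, errors_b):
    """F1 agreement between two sets of (span, category) error tuples.
    A simple stand-in for a consistency metric; exact-match only."""
    a, b = set(errors_a), set(errors_b)
    if not a and not b:
        return 1.0  # both annotators found no errors: perfect agreement
    overlap = len(a & b)
    precision = overlap / len(a) if a else 0.0
    recall = overlap / len(b) if b else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Real MQM agreement studies typically allow partial span overlap and category-level matching; exact matching, as here, is the strictest variant.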