CompactQE: Interpretable Translation Quality Estimation via Small Open-Weight LLMs

📅 2026-05-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the data-privacy risks and high costs of current machine translation quality estimation (QE) methods that rely on large, closed-source language models. The authors propose a single-prompt strategy in which open-source large language models with fewer than 30 billion parameters generate, in one pass, a quality score, MQM error annotations, suggested corrections for those errors, and a fully post-edited translation. The reported system-level correlations with human judgments are highly competitive: they exceed those of conventional neural metrics, fine-tuned models, and human inter-annotator agreement, and they approach the performance of much larger proprietary models. Because the models are small and open, the approach keeps data private and lowers cost, while the error annotations and corrections make the quality score interpretable.
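To make the single-prompt idea concrete, the sketch below shows one way a prompt could request all four outputs at once and how a reply could be parsed. It is an illustration only: the prompt wording, the JSON schema, and the names `build_prompt`, `parse_response`, `QEResult`, and `MQMError` are assumptions, not the authors' actual prompt or code, and no model call is included.

```python
import json
from dataclasses import dataclass

# Hypothetical output schema; the paper's real format is not given in this listing.
SCHEMA_HINT = (
    '{"score": <0-100>, "errors": [{"span": "...", "category": "...", '
    '"severity": "minor|major|critical", "correction": "..."}], '
    '"post_edit": "..."}'
)


@dataclass
class MQMError:
    span: str        # erroneous text in the translation
    category: str    # MQM category label, e.g. "accuracy/mistranslation"
    severity: str    # "minor", "major", or "critical"
    correction: str  # suggested fix for this span


@dataclass
class QEResult:
    score: float
    errors: list
    post_edit: str


def build_prompt(source, translation, src_lang, tgt_lang):
    """Build one prompt that asks for score, errors, corrections, and post-edit."""
    return (
        f"You are a translation quality evaluator for {src_lang} to {tgt_lang}.\n"
        f"Source: {source}\n"
        f"Translation: {translation}\n"
        "In one response, return a JSON object with a quality score, "
        "MQM error annotations, a suggested correction for each error, "
        "and a full post-edited translation.\n"
        "Schema: " + SCHEMA_HINT
    )


def parse_response(raw):
    """Parse the model's JSON reply into a structured result."""
    data = json.loads(raw)
    errors = [
        MQMError(e["span"], e["category"], e["severity"], e.get("correction", ""))
        for e in data.get("errors", [])
    ]
    return QEResult(
        score=float(data["score"]),
        errors=errors,
        post_edit=data.get("post_edit", ""),
    )
```

The string returned by `build_prompt` would be sent to whichever locally hosted model is in use, and the reply passed to `parse_response`. Because everything arrives in one reply, each segment needs only one model call.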
📝 Abstract
Current state-of-the-art Quality Estimation (QE) in machine translation relies on massive, proprietary LLMs, raising data privacy concerns. We demonstrate that smaller, open-source LLMs (<30B parameters) are a viable, cost-effective and privacy-preserving alternative. Using a single-pass prompting strategy, our models simultaneously generate quality scores, MQM error annotations, suggested error corrections, and full post-editions. Our analysis shows these models achieve highly competitive system-level correlations with human judgments that outperform traditional neural metrics, fine-tuned models, and human inter-annotator agreement, effectively approximating the capabilities of much larger proprietary LLMs.
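As a rough illustration of what a system-level correlation measures, the sketch below averages segment scores per MT system and then correlates the metric's system averages with the human ones. The choice of Pearson correlation and the function names are assumptions made for this example; the abstract does not state which statistic the authors use.

```python
from collections import defaultdict
from math import sqrt


def system_means(rows):
    """Average segment-level scores per system. rows: iterable of (system, score)."""
    totals = defaultdict(lambda: [0.0, 0])
    for system, score in rows:
        totals[system][0] += score
        totals[system][1] += 1
    return {system: total / count for system, (total, count) in totals.items()}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def system_level_correlation(metric_rows, human_rows):
    """Correlate per-system mean metric scores with per-system mean human scores."""
    metric = system_means(metric_rows)
    human = system_means(human_rows)
    systems = sorted(metric.keys() & human.keys())  # systems scored by both
    return pearson([metric[s] for s in systems], [human[s] for s in systems])
```

A value near 1 means the metric ranks and spaces the MT systems much as human raters do, even when individual segment scores disagree.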
Problem

Research questions and friction points this paper is trying to address.

Quality Estimation
Machine Translation
Data Privacy
Large Language Models
Open-Weight Models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Quality Estimation
Open-weight LLMs
Single-pass Prompting
MQM Error Annotation
Machine Translation