Combining the Best of Both Worlds: A Method for Hybrid NMT and LLM Translation

📅 2025-05-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the high computational cost and substantial latency of large language models (LLMs) in machine translation, this paper proposes an efficient hybrid neural machine translation (NMT)–LLM scheduling framework. The method introduces the first dynamic scheduler based on lightweight source-sentence representations, which employs a synergistic rule- and learning-based strategy to invoke the LLM only when NMT fails to meet quality thresholds. The framework tightly integrates an NMT backbone, LLM-based post-editing capabilities, source-side feature extraction, and an interpretable scheduling mechanism. Experiments across multilingual benchmarks demonstrate that translation quality approaches that of full-LLM systems, while LLM invocation rates drop by 60%–85% and end-to-end latency nears that of pure NMT. This yields a significantly improved efficiency–quality trade-off.
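The summary above describes a decider that routes each source sentence to NMT or the LLM based on lightweight source-side features. As a minimal illustrative sketch of that routing idea (all function names, features, and thresholds here are assumptions for illustration, not the paper's actual implementation):

```python
def extract_features(source: str) -> dict:
    """Lightweight source-side features a decider might compute."""
    tokens = source.split()
    return {
        "length": len(tokens),
        "avg_token_len": sum(map(len, tokens)) / max(len(tokens), 1),
        "digit_ratio": sum(ch.isdigit() for ch in source) / max(len(source), 1),
    }


def decide_engine(source: str, length_threshold: int = 30) -> str:
    """Rule-based decider: route 'hard' sentences (long, or number-heavy)
    to the LLM; everything else stays on the cheap NMT path."""
    feats = extract_features(source)
    if feats["length"] > length_threshold or feats["digit_ratio"] > 0.2:
        return "llm"
    return "nmt"


print(decide_engine("A short simple sentence."))  # → nmt
```

A learned variant could replace the hand-set thresholds with a small classifier trained to predict, from the same features, whether NMT output would fall below a quality threshold.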

📝 Abstract
Large language models (LLMs) show promising performance on a variety of downstream tasks, such as machine translation (MT). However, using LLMs for translation suffers from high computational cost and significant latency. Based on our evaluation, in most cases translations produced by LLMs are comparable to those generated by neural machine translation (NMT) systems; only in particular scenarios do LLM and NMT models show their respective advantages. As a result, integrating NMT and LLM for translation, using the LLM only when necessary, appears to be a sound solution. This requires a scheduling policy that optimizes translation quality while ensuring fast speed and as little LLM usage as possible. We compare several scheduling policies and propose a novel, straightforward decider that leverages source-sentence features. We conduct extensive experiments on multilingual test sets, and the results show that we can achieve optimal translation performance with minimal LLM usage, demonstrating the effectiveness of our decider.
Problem

Research questions and friction points this paper is trying to address.

Hybrid NMT and LLM translation optimization
Reduce LLM usage while maintaining quality
Efficient scheduling policy for translation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid NMT and LLM for efficient translation
Source sentence features guide scheduling policy
Minimal LLM usage ensures optimal performance
👤 Authors
Zhanglin Wu
2012 Lab, Huawei Co. LTD
Machine Translation, Natural Language Processing
Daimeng Wei
Huawei Translation Service Center, Beijing, China
Xiaoyu Chen
Huawei Translation Service Center, Beijing, China
Hengchao Shang
Huawei Translation Service Center, Beijing, China
Jiaxin Guo
Huawei Translation Service Center, Beijing, China
Zongyao Li
Huawei Translation Service Center, Beijing, China
Yuanchang Luo
2012 Lab, Huawei
Jinlong Yang
Huawei Translation Service Center, Beijing, China
Zhiqiang Rao
Huawei
NLP
Hao Yang
Huawei Translation Service Center, Beijing, China