Combining the Best of Both Worlds: A Method for Hybrid NMT and LLM Translation

📅 2025-05-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the high computational cost and substantial latency of large language models (LLMs) in machine translation, this paper proposes an efficient hybrid neural machine translation (NMT)–LLM scheduling framework. The method introduces the first dynamic scheduler based on lightweight source-sentence representations, which employs a synergistic rule- and learning-based strategy to invoke the LLM only when NMT fails to meet quality thresholds. The framework tightly integrates an NMT backbone, LLM-based post-editing capabilities, source-side feature extraction, and an interpretable scheduling mechanism. Experiments across multilingual benchmarks demonstrate that translation quality approaches that of full-LLM systems, while LLM invocation rates drop by 60%–85% and end-to-end latency nears that of pure NMT. This yields a significantly improved efficiency–quality trade-off.
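The summary above describes a decider that routes each source sentence to NMT or the LLM based on lightweight source-side features. As a minimal illustrative sketch of that routing idea (all function names, features, and thresholds here are assumptions for illustration, not the paper's actual implementation):

```python
def extract_features(source: str) -> dict:
    """Lightweight source-side features a decider might compute."""
    tokens = source.split()
    return {
        "length": len(tokens),
        "avg_token_len": sum(map(len, tokens)) / max(len(tokens), 1),
        "digit_ratio": sum(ch.isdigit() for ch in source) / max(len(source), 1),
    }


def decide_engine(source: str, length_threshold: int = 30) -> str:
    """Rule-based decider: route 'hard' sentences (long, or number-heavy)
    to the LLM; everything else stays on the cheap NMT path."""
    feats = extract_features(source)
    if feats["length"] > length_threshold or feats["digit_ratio"] > 0.2:
        return "llm"
    return "nmt"


print(decide_engine("A short simple sentence."))  # → nmt
```

A learned variant could replace the hand-set thresholds with a small classifier trained to predict, from the same features, whether NMT output would fall below a quality threshold.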

📝 Abstract
Large language models (LLMs) show promising performance on a variety of downstream tasks, such as machine translation (MT). However, using LLMs for translation suffers from high computational cost and significant latency. Based on our evaluation, in most cases translations produced by LLMs are comparable to those generated by neural machine translation (NMT) systems; only in particular scenarios do LLM and NMT models show their respective advantages. As a result, integrating NMT and LLM for translation, using the LLM only when necessary, appears to be a sound solution. This requires a scheduling policy that optimizes translation quality while ensuring fast speed and as little LLM usage as possible. We compare several scheduling policies and propose a novel, straightforward decider that leverages source-sentence features. We conduct extensive experiments on multilingual test sets, and the results show that we can achieve optimal translation performance with minimal LLM usage, demonstrating the effectiveness of our decider.
Problem

Research questions and friction points this paper is trying to address.

Hybrid NMT and LLM translation optimization
Reduce LLM usage while maintaining quality
Efficient scheduling policy for translation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid NMT and LLM for efficient translation
Source sentence features guide scheduling policy
Minimal LLM usage ensures optimal performance
👤 Authors
Zhanglin Wu
2012 Lab, Huawei Co. LTD
Machine Translation, Natural Language Processing
Daimeng Wei
Huawei Translation Service Center, Beijing, China
Xiaoyu Chen
Huawei Translation Service Center, Beijing, China
Hengchao Shang
Huawei Translation Service Center, Beijing, China
Jiaxin Guo
Huawei Translation Service Center, Beijing, China
Zongyao Li
Huawei Translation Service Center, Beijing, China
Yuanchang Luo
2012 Lab, Huawei
Jinlong Yang
Huawei Translation Service Center, Beijing, China
Zhiqiang Rao
Huawei
NLP
Hao Yang
Huawei Translation Service Center, Beijing, China