(Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts

📅 2024-05-20
🏛️ arXiv.org
📈 Citations: 21
Influential: 1
🤖 AI Summary
Accurately translating ultra-long literary texts remains challenging because metaphors, culturally embedded meanings, and authorial style are hard to preserve. Method: This paper proposes TransAgents, a multi-agent collaborative framework modeled on a professional translation company, comprising specialized roles (CEO, Senior Editor, Junior Editor, Translator, Localization Specialist, and Proofreader) operating in two sequential stages: preparation and execution. It also introduces a hybrid evaluation scheme combining Monolingual Human Preference (MHP) and Bilingual LLM Preference (BLP), sidestepping limitations of reference-based metrics such as BLEU. Contribution/Results: Although TransAgents scores lower on d-BLEU, both human evaluators and LLMs consistently prefer its translations over GPT-4 outputs and the human references, particularly for cultural adaptation and stylistic consistency, supporting multi-agent collaboration as an effective approach to literary translation.

📝 Abstract
Literary translation remains one of the most challenging frontiers in machine translation due to the complexity of capturing figurative language, cultural nuances, and unique stylistic elements. In this work, we introduce TransAgents, a novel multi-agent framework that simulates the roles and collaborative practices of a human translation company, including a CEO, Senior Editor, Junior Editor, Translator, Localization Specialist, and Proofreader. The translation process is divided into two stages: a preparation stage, where the team is assembled and comprehensive translation guidelines are drafted, and an execution stage, which involves sequential translation, localization, proofreading, and a final quality check. Furthermore, we propose two innovative evaluation strategies: Monolingual Human Preference (MHP), which evaluates translations based solely on target language quality and cultural appropriateness, and Bilingual LLM Preference (BLP), which leverages large language models like GPT-4 for direct text comparison. Although TransAgents achieves lower d-BLEU scores, due to the limited diversity of references, its translations are significantly better than those of other baselines and are preferred by both human evaluators and LLMs over traditional human references and GPT-4 translations. Our findings highlight the potential of multi-agent collaboration in enhancing translation quality, particularly for longer texts.
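The two-stage workflow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the `Agent` class, its `act` stub, and the guideline string are hypothetical stand-ins for actual LLM calls and prompts.

```python
from dataclasses import dataclass

@dataclass
class Agent:
    role: str

    def act(self, instruction: str, text: str) -> str:
        # Stand-in for an LLM call conditioned on the agent's role and
        # the shared translation guidelines; here it only tags the text.
        return f"[{self.role}] {text}"

def translate_chapter(chapter: str) -> str:
    # Preparation stage: assemble the team and draft shared guidelines
    # (in the paper, a CEO agent and senior editors handle this step).
    team = [Agent("Translator"), Agent("Localization Specialist"), Agent("Proofreader")]
    guidelines = "Preserve metaphor, register, and authorial style."
    # Execution stage: each agent refines the previous agent's output in turn.
    draft = chapter
    for agent in team:
        draft = agent.act(guidelines, draft)
    return draft
```

Chaining agents sequentially like this mirrors the hand-off structure of a translation company: each role sees the accumulated work of the roles before it.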
Problem

Research questions and friction points this paper is trying to address.

Addressing challenges in literary machine translation of figurative language and cultural nuances
Proposing a multi-agent framework simulating human translation company roles
Introducing novel evaluation strategies for translation quality and cultural appropriateness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent framework simulating human translation company roles
Two-stage process: preparation and execution stages
Novel evaluation strategies: MHP and BLP metrics
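The BLP idea above (pairwise preference judged by an LLM) could be sketched as below. The prompt wording and the `judge` callable are illustrative assumptions; in the paper the judge is a model such as GPT-4, and wins are aggregated over many comparisons into a preference rate.

```python
import random

def blp_preference(source: str, trans_a: str, trans_b: str, judge=None) -> str:
    """Bilingual LLM Preference (BLP) sketch: ask a judge model which of two
    candidate translations of `source` it prefers, returning "A" or "B".
    `judge` is a hypothetical stand-in for an LLM API call; without one,
    a random stub is used so the function stays runnable."""
    prompt = (
        f"Source:\n{source}\n\n"
        f"Translation A:\n{trans_a}\n\n"
        f"Translation B:\n{trans_b}\n\n"
        "Which translation is better? Answer A or B."
    )
    if judge is None:
        return random.choice(["A", "B"])  # stub; replace with a real LLM call
    return judge(prompt)
```

In practice such pairwise setups usually also swap the A/B order across trials to control for position bias in the judge model.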
Minghao Wu (Monash University)
Yulin Yuan (University of Macau)
Gholamreza Haffari (Monash University)
Longyue Wang (Alibaba International)
Large Language Model, Machine Translation, Natural Language Processing, Language Agent