GRAFT: A Graph-based Flow-aware Agentic Framework for Document-level Machine Translation

📅 2025-07-04
🤖 AI Summary
Existing document-level machine translation (DocMT) systems struggle to model discourse structure: heuristic segmentation produces units that rarely align with the document's true discourse structure, which hurts cross-sentence and cross-paragraph coherence. To address this, the authors propose GRAFT, a graph-based agentic framework that explicitly models document-level dependencies as a directed acyclic graph (DAG). Large language model (LLM) agents perform segmentation, discourse dependency identification, and context-aware translation, so the pipeline is discourse-aware end to end and does not depend on hand-crafted segmentation rules. Experiments across eight translation directions and six domains show an average gain of 2.8 d-BLEU on the IWSLT2017 TED test sets over strong baselines, and a gain of 2.3 d-BLEU on domain-specific English-to-Chinese translation. The authors' analyses indicate improved coherence and contextual accuracy in the translated documents.

📝 Abstract
Document-level Machine Translation (DocMT) approaches often struggle to capture discourse-level phenomena. Existing approaches either rely on heuristic rules to segment documents into discourse units, which rarely align with the true discourse structure required for accurate translation, or fail to maintain consistency throughout the document during translation. To address these challenges, we propose Graph Augmented Agentic Framework for Document Level Translation (GRAFT), a novel graph-based DocMT system that leverages Large Language Model (LLM) agents for document translation. Our approach integrates segmentation, directed acyclic graph (DAG) based dependency modelling, and discourse-aware translation into a cohesive framework. Experiments conducted across eight translation directions and six diverse domains demonstrate that GRAFT achieves significant performance gains over state-of-the-art DocMT systems. Specifically, GRAFT delivers an average improvement of 2.8 d-BLEU on the TED test sets from IWSLT2017 over strong baselines and 2.3 d-BLEU for domain-specific translation from English to Chinese. Moreover, our analyses highlight the consistent ability of GRAFT to address discourse-level phenomena, yielding coherent and contextually accurate translations.
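The gains above are reported in d-BLEU, which is commonly understood as BLEU computed with the whole document as the scoring unit rather than the individual sentence. The following is a minimal sketch of that idea, not the paper's evaluation code: the function names `corpus_bleu` and `d_bleu` are illustrative, and whitespace tokenization, a single reference, and no smoothing are simplifying assumptions, so its scores will not match standard tooling.

```python
import math
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hyps: List[str], refs: List[str], max_n: int = 4) -> float:
    """Plain corpus BLEU on a 0-100 scale.

    Assumptions: whitespace tokens, one reference per hypothesis, no smoothing.
    """
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hyps, refs):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Counter intersection keeps the minimum count, i.e. clipped matches.
            matches[n - 1] += sum((h_counts & r_counts).values())
            totals[n - 1] += sum(h_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, any empty n-gram order gives zero
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


def d_bleu(hyp_docs: List[List[str]], ref_docs: List[List[str]]) -> float:
    """Join each document's sentences so the document is the scoring unit."""
    hyps = [" ".join(doc) for doc in hyp_docs]
    refs = [" ".join(doc) for doc in ref_docs]
    return corpus_bleu(hyps, refs)


if __name__ == "__main__":
    doc = ["the cat sat", "on the mat"]
    print(d_bleu([doc], [doc]))
```

Because sentences are joined before scoring, n-grams may span sentence boundaries, and the hypothesis and reference documents do not need the same number of sentences.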
Problem

Research questions and friction points this paper is trying to address.

Capturing discourse phenomena in document translation
Aligning segmentation with true discourse structure
Maintaining consistency across document translations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph-based framework for document translation
LLM agents enhance translation accuracy
DAG dependency modeling for discourse coherence
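
The three contributions listed above can be read as one pipeline: segment the document, record which segments depend on which, then translate each segment after the segments it depends on. The sketch below illustrates that flow under stated assumptions and is not the authors' implementation. The callables `segment`, `find_dependencies`, and `translate` are hypothetical stand-ins for the LLM agents, and the toy versions exist only to make the example runnable.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set

# Hypothetical agent interfaces; these names do not come from the paper.
Segmenter = Callable[[str], List[str]]
DependencyFinder = Callable[[List[str]], Dict[int, Set[int]]]
Translator = Callable[[str, List[str]], str]


def translate_document(
    document: str,
    segment: Segmenter,
    find_dependencies: DependencyFinder,
    translate: Translator,
) -> List[str]:
    """Translate segments in dependency order, passing predecessor translations as context."""
    segments = segment(document)
    # deps[i] holds the indices of the segments that segment i depends on.
    deps = find_dependencies(segments)
    graph = {i: set(deps.get(i, set())) for i in range(len(segments))}
    translations: Dict[int, str] = {}
    # static_order() yields predecessors first and raises CycleError if the graph is not a DAG.
    for i in TopologicalSorter(graph).static_order():
        context = [translations[j] for j in sorted(graph[i])]
        translations[i] = translate(segments[i], context)
    return [translations[i] for i in range(len(segments))]


# Toy stand-ins so the sketch runs without any model.
def toy_segment(doc: str) -> List[str]:
    return [s.strip() for s in doc.split(".") if s.strip()]


def toy_dependencies(segments: List[str]) -> Dict[int, Set[int]]:
    # Assume each segment depends only on the one before it.
    return {i: {i - 1} for i in range(1, len(segments))}


def toy_translate(segment: str, context: List[str]) -> str:
    return f"<{segment}|ctx={len(context)}>"


if __name__ == "__main__":
    print(translate_document("A. B. C.", toy_segment, toy_dependencies, toy_translate))
```

Using a topological order guarantees that every segment's context is already translated when the segment is processed, which is one simple way to keep terminology and references consistent across the document.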
Authors

Himanshu Dutta (Indian Institute of Technology Bombay, India)
Sunny Manchanda (DYSL-AI)
Prakhar Bapat (Indian Institute of Technology Bombay, India)
Meva Ram Gurjar (DYSL-AI, DRDO, India)
Pushpak Bhattacharyya (Indian Institute of Technology Bombay, India)