AI Summary
Large language models (LLMs) exhibit imbalanced performance in machine translation (MT) across language families and domains, particularly amplifying societal biases present in training data for low-resource languages, thereby compromising translation fairness. To address this, we propose Translation Tangles, a hybrid bias detection framework integrating rule-based heuristics, semantic similarity filtering, and LLM-based validation. We introduce the first high-quality, human-annotated dataset for MT fairness evaluation, comprising 1,439 translation-reference pairs across 24 bilingual directions and diverse domains. Furthermore, we design a unified evaluation paradigm that jointly incorporates multi-metric benchmarking, semantic similarity computation, LLM-based verification, and human assessment. All code, data, and evaluation tools are publicly released to foster reproducible research and community advancement in fair MT.
Abstract
The rise of Large Language Models (LLMs) has redefined Machine Translation (MT), enabling context-aware and fluent translations across hundreds of languages and textual domains. Despite these capabilities, LLMs often perform unevenly across language families and specialized domains. Moreover, recent evidence shows that these models can encode and amplify biases present in their training data, raising serious fairness concerns, especially for low-resource languages. To address these gaps, we introduce Translation Tangles, a unified framework and dataset for evaluating the translation quality and fairness of open-source LLMs. Our approach benchmarks 24 bidirectional language pairs across multiple domains using multiple evaluation metrics. We further propose a hybrid bias detection pipeline that integrates rule-based heuristics, semantic similarity filtering, and LLM-based validation. We also introduce a high-quality, bias-annotated dataset based on human evaluations of 1,439 translation-reference pairs. The code and dataset are accessible on GitHub: https://github.com/faiyazabdullah/TranslationTangles
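The three-stage structure of the hybrid bias detection pipeline can be illustrated with a minimal Python sketch. This is not the released implementation: the abstract does not specify the rules, the similarity model, or the LLM prompt, so everything below is an assumption made for illustration. The lexicon `GENDERED_TERMS`, the function names, the order of the stages, the 0.8 threshold, and the use of Jaccard token overlap as a stand-in for semantic similarity are all hypothetical. The LLM stage is represented by a caller-supplied function so that the sketch runs without any model.

```python
import re
from typing import Callable, Dict, List, Set

# Hypothetical lexicon for the rule-based stage (an assumption; the
# actual heuristics are not described in the abstract).
GENDERED_TERMS: Set[str] = {"he", "she", "his", "her", "him", "man", "woman"}


def tokens(text: str) -> Set[str]:
    """Lowercase word tokens of a sentence, as a set."""
    return set(re.findall(r"[a-z']+", text.lower()))


def rule_flags(translation: str, reference: str) -> List[str]:
    """Stage 1: gendered terms in the translation that the reference lacks."""
    return sorted((tokens(translation) & GENDERED_TERMS) - tokens(reference))


def similarity(a: str, b: str) -> float:
    """Stage 2 stand-in: Jaccard token overlap instead of embedding similarity."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def detect_bias(
    translation: str,
    reference: str,
    validator: Callable[[str, str, List[str]], bool],
    sim_threshold: float = 0.8,
) -> Dict[str, object]:
    """Run rules, then the similarity filter, then the (assumed) LLM validator."""
    flagged = rule_flags(translation, reference)
    if not flagged:
        return {"biased": False, "stage": "rules", "terms": []}
    # Assumption: a flag on a near-identical pair is treated as spurious.
    if similarity(translation, reference) >= sim_threshold:
        return {"biased": False, "stage": "similarity", "terms": flagged}
    # Stage 3: defer the final decision to the supplied validator.
    verdict = validator(translation, reference, flagged)
    return {"biased": bool(verdict), "stage": "llm", "terms": flagged}


if __name__ == "__main__":
    # A stub validator standing in for an LLM call.
    always_confirm = lambda translation, reference, terms: True
    print(detect_bias(
        "The doctor said he would arrive soon",
        "The doctor said they would arrive soon",
        always_confirm,
    ))
```

In practice the `validator` argument would wrap a call to an LLM, and `similarity` would be replaced by a sentence-embedding comparison. The cheap stages run first so that only candidates surviving both filters incur the cost of an LLM call.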