🤖 AI Summary
Non-English-speaking developers face significant barriers to contributing to open-source projects due to the predominance of English-language technical documentation. To address this, we propose TRIFID, the first framework enabling automated, structure-aware quantitative evaluation of translations of mixed-content artifacts (natural language, source code, URLs, and Markdown) and supporting translation-aware continuous integration. We conduct English-to-German README translation experiments using ChatGPT-4 and Claude, validating results through both human evaluation and automated metrics. Our findings reveal that while large language models (LLMs) achieve high semantic fidelity, preserving formatting and code structure remains challenging; moreover, community-provided human translations are scarce and heavily concentrated in top-tier projects. This work empirically validates the feasibility of LLM-driven automated documentation internationalization, establishing a reproducible methodology and a standardized benchmark for multilingual support in open-source ecosystems.
📝 Abstract
While open source communities attract diverse contributors globally, few repositories provide essential documentation in languages other than English. Large language models (LLMs) have demonstrated remarkable capabilities in software engineering tasks and in translation across domains. However, little is known about LLM capabilities in translating open-source technical documentation, which mixes natural language, code, URLs, and Markdown formatting. To understand the need and potential for LLMs in technical documentation translation, we evaluated community translation activity and English-to-German translations of 50 README files using OpenAI's ChatGPT-4 and Anthropic's Claude. We found translation activity to be scarce, concentrated in larger repositories, and community-driven in nature. A comparison of the two LLMs suggests both can provide accurate translations. However, our analysis revealed fidelity challenges: both models struggled to preserve structural components (e.g., hyperlinks) and exhibited formatting inconsistencies. These findings highlight both the promise and the challenges of LLM-assisted documentation internationalization. As a first step toward translation-aware continuous integration pipelines, we introduce TRIFID, an early-stage translation fidelity scoring framework that automatically checks how well translations preserve code, links, and formatting. Our efforts provide a foundation for automated LLM-driven support for creating and maintaining open source documentation.
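The abstract describes TRIFID as automatically checking how well a translation preserves code, links, and formatting. A minimal sketch of such a structural fidelity check might look like the following; the function names and the scoring scheme (per-category preservation ratio, averaged across categories) are illustrative assumptions, not TRIFID's actual implementation.

```python
import re


def extract_structures(markdown: str) -> dict:
    """Collect structural elements that a faithful translation should keep verbatim."""
    return {
        # Fenced code blocks must survive untranslated.
        "code_blocks": re.findall(r"```.*?```", markdown, flags=re.DOTALL),
        # Inline code spans (single backticks, not part of a fence).
        "inline_code": re.findall(r"(?<!`)`([^`\n]+)`(?!`)", markdown),
        # Link targets from Markdown links [text](url); link text may be translated.
        "urls": re.findall(r"\[[^\]]*\]\(([^)\s]+)\)", markdown),
        # Heading markers, to detect dropped or altered heading structure.
        "headings": re.findall(r"^#+", markdown, flags=re.MULTILINE),
    }


def fidelity_score(source: str, translation: str) -> float:
    """Average, over non-empty categories, of the fraction of elements preserved."""
    src, tgt = extract_structures(source), extract_structures(translation)
    ratios = []
    for category, items in src.items():
        if not items:
            continue  # nothing to preserve in this category
        preserved = sum(1 for item in items if item in tgt[category])
        ratios.append(preserved / len(items))
    return sum(ratios) / len(ratios) if ratios else 1.0
```

A translation that keeps the heading marker, link URL, and inline code verbatim scores 1.0 under this sketch, while one that drops them scores lower; a CI pipeline could fail a translation PR whose score falls below a threshold.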