🤖 AI Summary
This work investigates whether multilingual neural machine translation (NMT) induces a “multilingual curse” or enables beneficial cross-lingual knowledge transfer within the Slavic language family. Addressing zero-shot translation for low-resource Slavic languages, we propose a language-family-aware multilingual modeling paradigm built upon the Transformer architecture, incorporating family-guided data augmentation, shared subword tokenization, and zero-shot joint training. We present the first systematic empirical validation of effective cross-lingual transfer among Slavic languages, substantially mitigating the multilingual curse. Our approach achieves state-of-the-art zero-shot translation performance across multiple low-resource Slavic language pairs. Furthermore, we release the first open-source, CC BY 4.0–licensed collection of Slavic-specific multilingual NMT models on the Hugging Face Hub, enabling plug-and-play deployment.
📝 Abstract
Does multilingual Neural Machine Translation (NMT) lead to the Curse of Multilinguality, or does it provide Cross-lingual Knowledge Transfer within a language family? In this study, we explore multiple approaches for extending the available data regime in NMT, and we demonstrate cross-lingual benefits even in the zero-shot translation regime for low-resource languages. With this paper, we provide state-of-the-art open-source NMT models for translating between selected Slavic languages. We have released our models on the Hugging Face Hub (https://hf.co/collections/allegro/multislav-6793d6b6419e5963e759a683) under the CC BY 4.0 license. The Slavic language family comprises morphologically rich Central and Eastern European languages. Although Slavic languages count hundreds of millions of native speakers, Slavic Neural Machine Translation is, in our opinion, under-studied. Recent NMT research has mostly focused on one of three areas: high-resource languages such as English, Spanish, and German (in the WMT23 General Translation Task, 7 out of 8 task directions are from or to English); massively multilingual models covering multiple language groups; or evaluation techniques.