Luth: Efficient French Specialization for Small Language Models and Cross-Lingual Transfer

📅 2025-10-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Small language models (SLMs) perform significantly worse on French than on English, and efficient methods for French adaptation remain scarce. Method: We introduce the Luth model family, which strengthens French capability without compromising English via targeted post-training on high-quality French data, combined with parameter merging and cross-lingual knowledge distillation. The approach avoids costly multilingual pretraining, focusing instead on precise French capability injection and efficient use of model capacity. Results: Luth surpasses same-scale open-source models across multiple French benchmarks (e.g., FLUE, FrenchBertEval), establishing a new state of the art for small French models, while English performance remains stable or slightly improves. This work provides a reproducible, cost-effective methodology for building resource-constrained multilingual SLMs and sets a new baseline for French SLMs.

📝 Abstract
The landscape of Large Language Models (LLMs) remains predominantly English-centric, resulting in a significant performance gap for other major languages, such as French, especially in the context of Small Language Models (SLMs). Existing multilingual models demonstrate considerably lower performance in French compared to English, and research on efficient adaptation methods for French remains limited. To address this, we introduce Luth, a family of French-specialized SLMs: through targeted post-training on curated, high-quality French data, our models outperform all open-source counterparts of comparable size on multiple French benchmarks while retaining their original English capabilities. We further show that strategic model merging enhances performance in both languages, establishing Luth as a new state of the art for French SLMs and a robust baseline for future French-language research.
Problem

Research questions and friction points this paper is trying to address.

Addressing French performance gap in small language models
Improving multilingual model efficiency for French language tasks
Developing specialized French SLMs while preserving English capabilities
Innovation

Methods, ideas, or system contributions that make the work stand out.

French-specialized SLMs via targeted post-training
Strategic model merging boosts bilingual performance
Curated high-quality French data enhances capabilities
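The "strategic model merging" named above typically means interpolating the parameters of two fine-tuned checkpoints. As a minimal sketch of that idea (the function name `merge_linear` and the toy weights are illustrative, not the authors' actual method or API; real merges operate on full PyTorch state dicts):

```python
def merge_linear(base_weights, french_weights, alpha=0.5):
    """Interpolate two checkpoints parameter-by-parameter:
    merged = (1 - alpha) * base + alpha * specialized.
    Both inputs map parameter names to values; alpha controls how far
    the merge moves toward the French-specialized checkpoint."""
    assert base_weights.keys() == french_weights.keys(), "checkpoints must share architecture"
    return {
        name: (1 - alpha) * base_weights[name] + alpha * french_weights[name]
        for name in base_weights
    }

# Toy example: scalars stand in for weight tensors.
base = {"layer0.w": 1.0, "layer0.b": 0.0}      # English-capable base model
french = {"layer0.w": 3.0, "layer0.b": 2.0}    # French-specialized fine-tune
merged = merge_linear(base, french, alpha=0.5)
print(merged)  # {'layer0.w': 2.0, 'layer0.b': 1.0}
```

Linear interpolation is the simplest merge; the appeal is that a well-chosen alpha can retain the base model's English ability while absorbing most of the French specialization, without any additional training.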
Maxence Lasbordes
ENS-Ulm / Dauphine / Télécom SudParis
AI · NLP · LLM · Machine Learning
Sinoué Gad
École Polytechnique, Télécom SudParis