Exploring the Feasibility of Multilingual Grammatical Error Correction with a Single LLM up to 9B parameters: A Comparative Study of 17 Models

📅 2025-05-09

🤖 AI Summary

This study investigates whether a single lightweight large language model (LLM) with ≤9B parameters can handle multilingual grammatical error correction (GEC) across English, German, Italian, and Swedish. We systematically evaluate 17 prominent open-source models (including Llama, Gemma, Phi, and Qwen) using zero-shot and few-shot prompting, and assess performance via three complementary metrics: BLEU, ERRANT, and human evaluation, emphasizing both correction accuracy and edit minimization. To our knowledge, this is the first comparative study of cross-lingual GEC in which one lightweight model corrects all languages. Results show that Gemma 9B consistently outperforms all other models across all four languages: on average, its correction accuracy is 4.2 percentage points higher than that of the second-best model, and it reduces edit distance by 23%. We identify six models that satisfy stringent cross-lingual performance thresholds, demonstrating that ≤9B-parameter LLMs can deliver high-quality multilingual GEC with minimal changes to the input. Gemma 9B emerges as the current state-of-the-art lightweight solution.
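To make the difference between zero-shot and few-shot prompting concrete, the following is a minimal sketch of how a GEC prompt might be assembled for one of the four languages. The instruction wording, the `Input:`/`Output:` layout, and the function name `build_gec_prompt` are assumptions made for illustration; the prompts actually used in the study are not given here.

```python
from typing import Sequence, Tuple

# The four languages covered by the study.
LANGUAGES = ("English", "German", "Italian", "Swedish")


def build_gec_prompt(
    text: str,
    language: str,
    examples: Sequence[Tuple[str, str]] = (),
) -> str:
    """Build a GEC prompt (hypothetical wording, not the paper's prompt).

    With no examples the prompt is zero-shot; each (source, corrected)
    pair in `examples` adds one demonstration, making it few-shot.
    """
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language: {language}")

    lines = [
        f"Correct the grammatical errors in the following {language} text. "
        "Change as little as possible and return only the corrected text."
    ]
    # Few-shot demonstrations, if any.
    for source, corrected in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {corrected}")
    # The text the model should correct.
    lines.append(f"Input: {text}")
    lines.append("Output:")
    return "\n".join(lines)


zero_shot = build_gec_prompt("She go home.", "English")
few_shot = build_gec_prompt(
    "She go home.",
    "English",
    examples=[("He have a dog.", "He has a dog.")],
)
```

The same function serves all four languages, which mirrors the single-model setting: only the language name and the demonstrations change, not the model.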

📝 Abstract

Recent language models can successfully solve various language-related tasks, and many understand inputs written in different languages. In this paper, we explore the performance of 17 popular models used to correct grammatical issues in texts written in English, German, Italian, and Swedish when using a single model to correct texts in all those languages. We analyze the outputs generated by these models, focusing on decreasing the number of grammatical errors while keeping the changes small. The conclusions drawn help us understand what problems occur among those models and which models can be recommended for multilingual grammatical error correction tasks. We list six models that improve grammatical correctness in all four languages and show that Gemma 9B is currently the best performing one for the languages considered.
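One simple way to quantify "keeping the changes small" is a token-level edit distance between the original text and the model's output. The sketch below is an illustration under that assumption; the abstract does not specify how the paper measures the size of changes, and the names `token_edit_distance` and `edit_ratio` are hypothetical.

```python
def token_edit_distance(source: str, corrected: str) -> int:
    """Levenshtein distance over whitespace-separated tokens.

    Counts the minimum number of token insertions, deletions, and
    substitutions needed to turn `source` into `corrected`.
    """
    a, b = source.split(), corrected.split()
    # prev[j] holds the distance between the first i-1 tokens of a
    # and the first j tokens of b.
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(
                min(
                    prev[j] + 1,         # delete tok_a
                    curr[j - 1] + 1,     # insert tok_b
                    prev[j - 1] + cost,  # keep or substitute
                )
            )
        prev = curr
    return prev[-1]


def edit_ratio(source: str, corrected: str) -> float:
    """Edit distance normalized by the number of source tokens."""
    n_tokens = max(len(source.split()), 1)  # avoid division by zero
    return token_edit_distance(source, corrected) / n_tokens


minimal = token_edit_distance(
    "She go to school yesterday .",
    "She went to school yesterday .",
)
rewrite = token_edit_distance(
    "She go to school yesterday .",
    "Yesterday , she attended her classes .",
)
```

Here `minimal` is 1 (a single substituted token), while `rewrite` is larger because the output paraphrases the sentence. Both outputs are grammatical, but only the first keeps the change small, which is the behavior the study favors.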

Problem

Research questions and friction points this paper is trying to address.

Evaluating 17 models for multilingual grammatical error correction

Assessing single-model performance across English, German, Italian, Swedish

Identifying top-performing models like Gemma 9B for multilingual GEC

Innovation

Methods, ideas, or system contributions that make the work stand out.

Single LLM for multilingual error correction

Comparative study of 17 models

Gemma 9B performs best
