IndicGEC: Powerful Models, or a Measurement Mirage?

📅 2025-11-19
🤖 AI Summary
This study addresses the Indic Multilingual Grammatical Error Correction (IndicGEC) task, systematically evaluating correction performance across five low-resource Indian languages: Telugu, Hindi, Tamil, Malayalam, and Bengali. To overcome limitations of existing datasets (e.g., inconsistent quality) and evaluation metrics (e.g., GLEU's inadequate adaptation to Indian scripts), the authors propose a lightweight, prompt-based approach that applies zero-shot and few-shot prompting to language models ranging from 4B parameters to large proprietary systems in cross-lingual experiments. Key contributions include: (1) demonstrating the untapped potential of smaller models for low-resource Indian languages; (2) underscoring the critical need for high-quality, domain-specific annotated data; and (3) advocating script-aware evaluation metrics. On BHASHA Task 1, the method achieves top-tier results, placing 4th in Telugu (GLEU = 83.78) and 2nd in Hindi (GLEU = 84.31), validating the effectiveness and generalizability of prompt-driven paradigms for multiscript, low-resource grammatical error correction.

📝 Abstract
In this paper, we report the results of TeamNRC's participation in the BHASHA-Task 1 Grammatical Error Correction shared task (https://github.com/BHASHA-Workshop/IndicGEC2025/) for five Indian languages. Our approach, focusing on zero/few-shot prompting of language models of varying sizes (4B to large proprietary models), achieved Rank 4 in Telugu and Rank 2 in Hindi, with GLEU scores of 83.78 and 84.31 respectively. We extend the experiments to the other three languages of the shared task, Tamil, Malayalam and Bangla, and take a closer look at the data quality and the evaluation metric used. Our results primarily highlight the potential of small language models, and summarize the concerns related to creating good-quality datasets and appropriate metrics for this task that suit Indian language scripts.
Problem

Research questions and friction points this paper is trying to address.

Developing grammatical error correction models for five Indian languages
Evaluating zero/few-shot prompting with language models of varying sizes
Addressing data quality and evaluation metric concerns for Indian scripts
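The evaluation-metric concern can be made concrete with a toy comparison. The sketch below is illustrative only, not the paper's method: it uses a simplified n-gram precision (not the full GLEU formula), and the Telugu sentence pair is an assumed example. The point it shows is real, though: for morphologically rich Indic scripts, a single inflectional change wipes out most word-level n-gram matches, while character-level matching still gives partial credit.

```python
from collections import Counter

def ngrams(tokens, n):
    # Multiset of all contiguous n-grams in a token sequence.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def overlap_score(hyp, ref, max_n=4):
    # Simplified n-gram precision (NOT the full GLEU formula):
    # fraction of hypothesis n-grams (n = 1..max_n) also found in the reference.
    matches = total = 0
    for n in range(1, max_n + 1):
        h, r = ngrams(hyp, n), ngrams(ref, n)
        matches += sum((h & r).values())  # clipped n-gram matches
        total += sum(h.values())
    return matches / total if total else 0.0

# Assumed example: "I am reading a/the book", with the object in two
# different (both plausible) inflected forms.
ref = "నేను పుస్తకం చదువుతున్నాను"
hyp = "నేను పుస్తకాన్ని చదువుతున్నాను"

# Word-level: one changed word kills all bigram/trigram matches.
word_score = overlap_score(hyp.split(), ref.split())  # exactly 1/3 here

# Character-level: shared script content still earns partial credit.
char_score = overlap_score(list(hyp.replace(" ", "")),
                           list(ref.replace(" ", "")))
```

Here `char_score` comes out well above `word_score`, which is one way to see why a script-aware (e.g., character-sensitive) metric can rank near-correct outputs more fairly for Indic languages.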
Innovation

Methods, ideas, or system contributions that make the work stand out.

Zero/few-shot prompting of language models
Testing models from 4B to large proprietary sizes
Analyzing data quality and evaluation metrics
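The zero/few-shot setup above can be sketched as plain prompt construction. Everything here is a hypothetical illustration: the function name, prompt wording, and Telugu error examples are my own assumptions, since the summary does not give TeamNRC's actual prompts. Passing no examples yields a zero-shot prompt; passing (erroneous, corrected) pairs yields a few-shot one.

```python
def build_gec_prompt(sentence, language, examples=()):
    # Hypothetical prompt template for grammatical error correction;
    # the actual prompts used in the shared task may differ.
    lines = [
        f"Correct the grammatical errors in the following {language} sentence. "
        "Return only the corrected sentence."
    ]
    for src, tgt in examples:  # empty tuple -> zero-shot prompt
        lines.append(f"Input: {src}\nOutput: {tgt}")
    lines.append(f"Input: {sentence}\nOutput:")
    return "\n\n".join(lines)

# Assumed Telugu sentence with a subject-verb agreement error.
zero_shot = build_gec_prompt("నేను పుస్తకం చదువుతుంది", "Telugu")

# Few-shot: prepend one illustrative (erroneous, corrected) pair.
few_shot = build_gec_prompt(
    "నేను పుస్తకం చదువుతుంది", "Telugu",
    examples=[("అతను బడికి వెళ్తాను", "అతను బడికి వెళ్తాడు")],
)
```

The same template works across model sizes, which is what makes this paradigm cheap to apply to 4B-parameter open models and large proprietary models alike.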