🤖 AI Summary
This study addresses the Indic Multilingual Grammatical Error Correction (IndicGEC) task, systematically evaluating correction performance across five low-resource Indian languages: Telugu, Hindi, Tamil, Malayalam, and Bengali. To overcome limitations of existing datasets (e.g., inconsistent quality) and evaluation metrics (e.g., GLEU's inadequate adaptation to Indian scripts), we propose a lightweight, prompt-based approach that applies zero-shot and few-shot prompting to language models ranging from 4B parameters to large proprietary systems in cross-lingual experiments. Key contributions include: (1) demonstrating the untapped potential of smaller models for low-resource Indian languages; (2) underscoring the critical need for high-quality, domain-specific annotated data; and (3) advocating script-aware evaluation metrics. On BHASHA Task 1, our method achieves top-tier results, placing 4th in Telugu (GLEU = 83.78) and 2nd in Hindi (GLEU = 84.31), validating the effectiveness and generalizability of prompt-driven paradigms for multiscript, low-resource grammatical error correction.
📝 Abstract
In this paper, we report the results of TeamNRC's participation in the BHASHA-Task 1 Grammatical Error Correction shared task (https://github.com/BHASHA-Workshop/IndicGEC2025/) for five Indian languages. Our approach, which focuses on zero/few-shot prompting of language models of varying sizes (from 4B parameters to large proprietary models), achieved Rank 4 in Telugu and Rank 2 in Hindi, with GLEU scores of 83.78 and 84.31 respectively. We extend the experiments to the other three languages of the shared task (Tamil, Malayalam, and Bangla) and take a closer look at the data quality and the evaluation metric used. Our results primarily highlight the potential of small language models, and summarize the concerns around creating good-quality datasets and appropriate metrics for this task that suit Indian language scripts.
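The script-sensitivity concern around GLEU can be made concrete with a small sketch. The function below implements sentence-level GLEU following the standard definition (the minimum of n-gram precision and recall for n = 1..4); the word- vs. character-tokenization comparison is our own illustrative assumption, not the shared task's official evaluation script, but it shows how the tokenization choice can change the score for scripts where a single "word" carries several correctable units.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def gleu(reference, hypothesis, max_n=4):
    """Sentence-level GLEU: min(precision, recall) over pooled
    n-gram counts for n = 1..max_n."""
    ref_counts, hyp_counts = Counter(), Counter()
    for n in range(1, max_n + 1):
        ref_counts.update(ngrams(reference, n))
        hyp_counts.update(ngrams(hypothesis, n))
    if not ref_counts or not hyp_counts:
        return 0.0
    overlap = sum((ref_counts & hyp_counts).values())
    precision = overlap / sum(hyp_counts.values())
    recall = overlap / sum(ref_counts.values())
    return min(precision, recall)


# Toy example: a one-word difference between reference and hypothesis.
ref = "this is a correct sentence"
hyp = "this is a corrected sentence"

# Word-level tokens (the typical GLEU setup).
word_score = gleu(ref.split(), hyp.split())

# Character-level tokens: for scripts with conjuncts and combining
# vowel signs, character n-grams can credit partial corrections that
# word-level matching misses entirely.
char_score = gleu(list(ref), list(hyp))

print(f"word-level GLEU: {word_score:.3f}")
print(f"char-level GLEU: {char_score:.3f}")
```

In this toy case the character-level score is higher because "correct" and "corrected" share most of their character n-grams, whereas at the word level they are simply a mismatch; the analogous effect for Indic scripts motivates the call for script-aware metrics.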