Zero-Shot Grammar Competency Estimation Using Large Language Model Generated Pseudo Labels

📅 2025-11-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
Expert annotations for grammatical proficiency assessment are scarce—especially for spontaneous, disfluent spoken language—hindering reliable evaluation. Method: This paper proposes a zero-shot pseudo-labeling framework that leverages scoring rubrics to guide large language models (LLMs) in generating high-quality pseudo-labels. It integrates a noise-robust training mechanism and a score-consistency optimization strategy within a Transformer-based architecture. Contribution/Results: To our knowledge, this is the first work to jointly employ structured prompt-driven pseudo-label generation and noise-robust learning for grammatical proficiency estimation—requiring no human annotations and supporting both spoken and written modalities. Experiments demonstrate substantial improvements over supervised baselines in low-resource settings, achieving high accuracy, strong interpretability (via rubric-aligned outputs), and practical applicability for real-world language assessment.

📝 Abstract
Grammar competency estimation is essential for assessing linguistic proficiency in both written and spoken language; however, the spoken modality presents additional challenges due to its spontaneous, unstructured, and disfluent nature. Developing accurate grammar scoring models further requires extensive expert annotation, making large-scale data creation impractical. To address these limitations, we propose a zero-shot grammar competency estimation framework that leverages unlabeled data and Large Language Models (LLMs) without relying on manual labels. During training, we employ LLM-generated predictions on unlabeled data by using grammar competency rubric-based prompts. These predictions, treated as pseudo labels, are utilized to train a transformer-based model through a novel training framework designed to handle label noise effectively. We show that the choice of LLM for pseudo-label generation critically affects model performance and that the ratio of clean-to-noisy samples during training strongly influences stability and accuracy. Finally, a qualitative analysis of error intensity and score prediction confirms the robustness and interpretability of our approach. Experimental results demonstrate the efficacy of our approach in estimating grammar competency scores with high accuracy, paving the way for scalable, low-resource grammar assessment systems.
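The abstract notes that pseudo-labels from an LLM are noisy, and that the clean-to-noisy ratio affects training stability. The paper's exact noise-handling mechanism is not detailed in this summary, so the sketch below illustrates the general idea with a Huber loss, a common noise-robust choice for regression: it penalizes small errors quadratically but large errors only linearly, so a badly wrong pseudo-label pulls the model less than it would under squared error. All names here are illustrative, not the authors' implementation.

```python
def huber(pred: float, target: float, delta: float = 1.0) -> float:
    """Huber loss: quadratic near zero error, linear for large errors,
    so an outlier pseudo-label contributes a bounded gradient."""
    err = abs(pred - target)
    if err <= delta:
        return 0.5 * err ** 2
    return delta * (err - 0.5 * delta)

# A roughly clean pseudo-label (small error) vs. a noisy one (large error):
clean = huber(3.2, 3.0)               # 0.5 * 0.2**2 = 0.02
noisy = huber(3.2, 1.0)               # 1.0 * (2.2 - 0.5) = 1.7
mse_on_noisy = 0.5 * (3.2 - 1.0) ** 2  # 2.42: squared loss punishes the outlier harder
print(clean, noisy, mse_on_noisy)
```

The comparison on the noisy sample (1.7 vs. 2.42) shows why down-weighting large residuals helps when some fraction of the pseudo-labels are wrong.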
Problem

Research questions and friction points this paper is trying to address.

Estimating grammar competency in spontaneous spoken language without manual annotations
Developing accurate grammar scoring models with limited expert-labeled training data
Creating scalable grammar assessment systems using LLM-generated pseudo labels
Innovation

Methods, ideas, or system contributions that make the work stand out.

Using LLM-generated pseudo labels to train a grammar scorer without manual annotation
Employing a Transformer-based model within a noise-robust training framework
Leveraging rubric-based prompts to guide LLM grammar assessment
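The pseudo-labeling pipeline described above can be sketched as follows. This is a minimal illustration, not the paper's actual prompt or rubric: the rubric text, the score range, and the helper names are all assumptions, and the LLM reply is mocked rather than fetched from a real API.

```python
import re

# Hypothetical 1-5 grammar rubric; the paper's actual rubric text is not public here.
RUBRIC = """Score the speaker's grammar competency from 1 to 5:
5 - consistently accurate, including complex structures
4 - mostly accurate, occasional minor errors
3 - noticeable errors that rarely impede meaning
2 - frequent errors that sometimes impede meaning
1 - pervasive errors that often impede meaning"""

def build_rubric_prompt(transcript: str) -> str:
    """Wrap an unlabeled transcript in a rubric-based scoring prompt."""
    return (
        f"{RUBRIC}\n\n"
        f"Transcript:\n{transcript}\n\n"
        "Reply with only the integer score."
    )

def parse_score(llm_response: str, lo: int = 1, hi: int = 5):
    """Extract the first in-range integer from the LLM reply; None if unparsable."""
    m = re.search(r"\b([1-9])\b", llm_response)
    if m and lo <= int(m.group(1)) <= hi:
        return int(m.group(1))
    return None  # discard unparsable replies rather than guess a label

# Example with a mocked LLM reply (no API call is made):
prompt = build_rubric_prompt("um, I goes to the market yesterday and, uh, buyed apples")
pseudo_label = parse_score("Score: 3")
print(pseudo_label)  # → 3
```

Each (transcript, pseudo_label) pair would then serve as a training example for the downstream Transformer scorer, with the noise-robust training stage absorbing whatever labeling errors the LLM makes.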