Towards Prompt Generalization: Grammar-aware Cross-Prompt Automated Essay Scoring

📅 2025-02-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
Cross-prompt automated essay scoring (AES) generalizes poorly to unseen prompts, largely because existing models are trained on essay-score pairs from specific prompts and therefore struggle to learn prompt-invariant essay representations. To address this, we propose Grammar-Aware cross-Prompt trait Scoring (GAPS), an AES framework that explicitly incorporates grammatical error correction (GEC) signals. GAPS refers to both the original essay and its GEC-corrected version, which lets the model capture prompt-independent syntactic aspects and learn generic essay representations. In cross-prompt evaluation, GAPS improves quadratic weighted kappa (QWK), with the clearest gains on prompt-independent and grammar-related traits and in the most challenging cross-prompt scenario.

📝 Abstract
In automated essay scoring (AES), recent efforts have shifted toward cross-prompt settings that score essays on unseen prompts for practical applicability. However, prior methods trained with essay-score pairs of specific prompts pose challenges in obtaining prompt-generalized essay representation. In this work, we propose a grammar-aware cross-prompt trait scoring (GAPS), which internally captures prompt-independent syntactic aspects to learn generic essay representation. We acquire grammatical error-corrected information in essays via the grammar error correction technique and design the AES model to seamlessly integrate such information. By internally referring to both the corrected and the original essays, the model can focus on generic features during training. Empirical experiments validate our method's generalizability, showing remarkable improvements in prompt-independent and grammar-related traits. Furthermore, GAPS achieves notable QWK gains in the most challenging cross-prompt scenario, highlighting its strength in evaluating unseen prompts.
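The QWK metric mentioned above measures agreement between human and predicted scores, penalizing each disagreement by the squared distance between the two scores. The following is a minimal sketch of the standard quadratic weighted kappa definition, not the authors' evaluation code; the function name `quadratic_weighted_kappa` and its parameters are illustrative assumptions, and scores are assumed to be integers within a known range.

```python
from collections import Counter


def quadratic_weighted_kappa(human, predicted, min_score, max_score):
    """Standard QWK between two lists of integer scores (illustrative helper)."""
    if len(human) != len(predicted) or not human:
        raise ValueError("score lists must be non-empty and of equal length")

    n_levels = max_score - min_score + 1
    if n_levels == 1:
        # Only one possible score, so agreement is trivially perfect.
        return 1.0

    total = len(human)
    observed = Counter(zip(human, predicted))  # observed confusion counts
    hist_human = Counter(human)                # marginal counts, human scores
    hist_pred = Counter(predicted)             # marginal counts, model scores

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(min_score, max_score + 1):
        for j in range(min_score, max_score + 1):
            # Quadratic penalty, normalized to [0, 1].
            weight = (i - j) ** 2 / (n_levels - 1) ** 2
            weighted_observed += weight * observed[(i, j)]
            # Expected count if the two sets of scores were independent.
            weighted_expected += weight * hist_human[i] * hist_pred[j] / total

    if weighted_expected == 0.0:
        # Both lists hold the same single score: treat as perfect agreement.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected
```

Identical score lists give a QWK of 1.0, chance-level agreement gives a value near 0, and systematic disagreement gives negative values.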
Problem

Research questions and friction points this paper is trying to address.

Automated essay scoring across unseen prompts
Grammar-aware generic essay representation learning
Integrating grammar error correction in AES models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Grammar-aware cross-prompt trait scoring
Integrates grammar error correction technique
Focuses on generic essay representation
Heejin Do
Postdoctoral Fellow, ETH Zurich, ETH AI Center
NLP, AI in Education, Evaluation, Human-AI Interaction, Interpretability
Taehee Park
Graduate School of Artificial Intelligence, POSTECH, Republic of Korea
Sangwon Ryu
POSTECH
Natural Language Processing, Text Summarization, Reinforcement Learning, Large Language Models
Gary Geunbae Lee
Graduate School of Artificial Intelligence, POSTECH, Republic of Korea; Department of Computer Science and Engineering, POSTECH, Republic of Korea