🤖 AI Summary
Existing automated essay scoring (AES) research predominantly focuses on holistic scores, neglecting fine-grained, cross-topic assessment of writing traits (e.g., logical coherence, lexical richness).
Method: This paper introduces the first rubric-based, trait-specific AES framework, combining LLM-driven generation of trait-oriented assessment questions with classical regression modeling. Through prompt engineering, the LLM turns each trait's grading rubric into trait-specific assessment questions, whose answers serve as transferable, trait-level features; a regression module then predicts dimension-wise scores.
Contribution/Results: The method achieves state-of-the-art performance across all trait dimensions on mainstream benchmarks. The LLM-generated trait features contribute most to scoring accuracy and markedly improve cross-topic generalization, robustness to topic shift, and interpretability, addressing key limitations of prior holistic, trait-agnostic approaches.
📝 Abstract
Research on holistic Automated Essay Scoring (AES) is long-standing; yet, there is a notable lack of attention to assessing essays according to individual traits. In this work, we propose TRATES, a novel trait-specific and rubric-based cross-prompt AES framework that is generic yet specific to the underlying trait. The framework leverages a Large Language Model (LLM) that uses the trait grading rubrics to generate trait-specific features (represented by assessment questions), then assesses those features given an essay. The trait-specific features are combined with generic writing-quality and prompt-specific features to train a simple classical regression model that predicts trait scores for essays from an unseen prompt. Experiments show that TRATES achieves new state-of-the-art performance across all traits on a widely used dataset, with the LLM-generated features being the most significant.
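To make the pipeline concrete, here is a minimal sketch of the described flow: rubric-driven question generation, question answering as trait features, generic writing-quality features, and a simple regression on top. The LLM calls are mocked with stand-in heuristics, and all function names, questions, and data are illustrative assumptions, not the authors' actual implementation.

```python
import numpy as np

def generate_trait_questions(trait, rubric):
    # In TRATES, an LLM converts the trait's grading rubric into assessment
    # questions. Hard-coded stand-ins here for a hypothetical "coherence" trait.
    return [
        "Does the essay use transition words between paragraphs?",
        "Does each paragraph develop a single idea?",
        "Do the ideas follow a logical order?",
    ]

def answer_questions(essay, questions):
    # The LLM answers each question for a given essay; the answers become
    # binary trait-specific features. Mocked with a trivial keyword check.
    return [1.0 if "therefore" in essay.lower() else 0.0 for _ in questions]

def generic_features(essay):
    # Generic writing-quality features, e.g. length and vocabulary richness.
    words = essay.split()
    return [float(len(words)), len(set(words)) / max(len(words), 1)]

def featurize(essay, questions):
    # Trait-specific + generic features, as in the paper's feature combination.
    return answer_questions(essay, questions) + generic_features(essay)

# Fit a simple least-squares regression on essays from seen prompts,
# then score essays from an unseen prompt with the learned weights.
questions = generate_trait_questions("coherence", rubric="<rubric text>")
train_essays = ["Therefore the plan works. It is sound.",
                "Cats sit. Dogs run. Random words here."]
train_scores = [4.0, 2.0]

X = np.array([featurize(e, questions) for e in train_essays])
X = np.hstack([X, np.ones((len(X), 1))])  # bias column
w, *_ = np.linalg.lstsq(X, np.array(train_scores), rcond=None)

def predict_trait_score(essay):
    x = np.array(featurize(essay, questions) + [1.0])
    return float(x @ w)
```

In the actual framework the regression model is trained on LLM-assessed features from multiple source prompts and evaluated on a held-out prompt; this sketch only shows how question answers and generic features feed a classical regressor.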