🤖 AI Summary
Cross-topic automated essay scoring (AES) generalizes poorly because topics differ from one another; existing approaches predominantly model topic-shared features and neglect topic-specific ones, which hinders accurate assessment of critical dimensions such as topical consistency. To address this, we propose a topic-aware adversarial prompt-tuning framework. It is presented as the first to jointly model learnable shared prompts and topic-specific prompts, and it introduces a neighborhood classifier that generates pseudo-labels to supervise topic-specific prompt learning. The framework integrates pretrained language models, adversarial distribution alignment, a unified regression-classification objective, and prompt-tuning techniques. Evaluated on the ASAP++ benchmark, our method significantly outperforms current state-of-the-art methods in both overall scoring accuracy and multi-dimensional performance (e.g., coherence, development, grammar).
📝 Abstract
Cross-topic automated essay scoring (AES) aims to develop a transferable model capable of effectively evaluating essays on a target topic. A significant challenge in this domain arises from the inherent discrepancies between topics. While existing methods predominantly focus on extracting topic-shared features through distribution alignment of source and target topics, they often neglect topic-specific features, limiting their ability to assess critical traits such as topic adherence. To address this limitation, we propose Adversarial TOpic-aware Prompt-tuning (ATOP), a novel method that jointly learns topic-shared and topic-specific features to improve cross-topic AES. ATOP achieves this by optimizing a learnable topic-aware prompt, comprising both shared and specific components, to elicit relevant knowledge from pre-trained language models (PLMs). To enhance the robustness of topic-shared prompt learning and mitigate the feature scale sensitivity introduced by topic alignment, we incorporate adversarial training within a unified regression and classification framework. In addition, we employ a neighbor-based classifier to model the local structure of essay representations and generate pseudo-labels for target-topic essays. These pseudo-labels are then used to guide the supervised learning of topic-specific prompts tailored to the target topic. Extensive experiments on the publicly available ASAP++ dataset demonstrate that ATOP significantly outperforms existing state-of-the-art methods in both holistic and multi-trait essay scoring. The implementation of our method is publicly available at: https://anonymous.4open.science/r/ATOP-A271.
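To make the pseudo-labeling step concrete, the following is a minimal sketch of a neighbor-based classifier that assigns labels to unlabeled target-topic essays from their nearest labeled neighbors. The abstract does not specify how the neighbor-based classifier is built, so everything here is an assumption for illustration: a plain k-nearest-neighbor majority vote with Euclidean distance, labeled source-topic essays as the reference set, discretized score bands as labels, and the hypothetical names `knn_pseudo_labels` and `euclidean`. It is not the authors' implementation.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_pseudo_labels(source_feats, source_labels, target_feats, k=3):
    """Assign a pseudo-label to each target essay by majority vote of its
    k nearest labeled essays (assumed here to come from source topics).

    Returns a list of (label, confidence) pairs, where confidence is the
    fraction of the k neighbors that voted for the winning label.
    """
    if len(source_feats) != len(source_labels):
        raise ValueError("source_feats and source_labels must align")
    if k < 1 or k > len(source_feats):
        raise ValueError("k must be between 1 and the number of source essays")

    pseudo = []
    for target in target_feats:
        # Indices of the k closest labeled essays, nearest first.
        nearest = sorted(
            range(len(source_feats)),
            key=lambda i: euclidean(source_feats[i], target),
        )[:k]
        votes = Counter(source_labels[i] for i in nearest)
        top = max(votes.values())
        # Break ties in favor of the label held by the closest neighbor.
        label = next(
            source_labels[i] for i in nearest if votes[source_labels[i]] == top
        )
        pseudo.append((label, top / k))
    return pseudo


if __name__ == "__main__":
    # Toy 2-D "essay representations"; labels are hypothetical score bands.
    source = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
    bands = [0, 0, 0, 1, 1, 1]
    target = [(0.05, 0.1), (1.0, 0.95)]
    print(knn_pseudo_labels(source, bands, target, k=3))
```

The confidence value is one possible way to filter unreliable pseudo-labels before they supervise the topic-specific prompt; whether ATOP applies such a filter is not stated in the abstract.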