Exposía: Academic Writing Assessment of Exposés and Peer Feedback

📅 2026-01-10
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses a critical gap in educational automated scoring research: the lack of publicly available datasets that link academic writing with feedback evaluation. To bridge this gap, the authors construct the first multi-stage dataset for higher education that jointly captures student research proposals, iterative peer feedback, and fine-grained human assessment scores grounded in a pedagogical rubric. Leveraging this resource, they systematically evaluate open-source large language models (LLMs) on automated scoring tasks. Results show strong alignment between model and human scores on dimensions that require little domain knowledge, but performance declines on dimensions that evaluate content. Models also align more closely with instructors who give high scores. A prompting strategy that scores multiple dimensions jointly yields the best results, a finding with practical relevance for classroom deployment. This work contributes both a novel data resource and an evaluation framework to advance AI in education.
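The summary reports agreement between model and human scores on each rubric dimension. As a minimal illustrative sketch (not taken from the paper), one common way to quantify such agreement on ordinal rubric scores is quadratically weighted Cohen's kappa; the 1-5 scores below are made up:

```python
# Illustrative only: compare hypothetical LLM scores against human scores for one
# rubric dimension using quadratically weighted Cohen's kappa (scikit-learn).
from sklearn.metrics import cohen_kappa_score

human_scores = [4, 3, 5, 2, 4, 3, 5, 1]  # hypothetical 1-5 rubric scores
llm_scores = [4, 3, 4, 2, 5, 3, 5, 2]    # hypothetical model predictions

qwk = cohen_kappa_score(human_scores, llm_scores, weights="quadratic")
print(f"Quadratic weighted kappa: {qwk:.3f}")
```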

📝 Abstract
We present Exposía, the first public dataset that connects writing and feedback assessment in higher education, enabling research on educationally grounded approaches to academic writing evaluation. Exposía includes student research project proposals and peer and instructor feedback consisting of comments and free-text reviews. The dataset was collected in the "Introduction to Scientific Work" course of the Computer Science undergraduate program that focuses on teaching academic writing skills and providing peer feedback on academic writing. Exposía reflects the multi-stage nature of the academic writing process that includes drafting, providing and receiving feedback, and revising the writing based on the feedback received. Both the project proposals and peer feedback are accompanied by human assessment scores based on a fine-grained, pedagogically-grounded schema for writing and feedback assessment that we develop. We use Exposía to benchmark state-of-the-art open-source large language models (LLMs) for two tasks: automated scoring of (1) the proposals and (2) the student reviews. The strongest LLMs attain high agreement on scoring aspects that require little domain knowledge but degrade on dimensions evaluating content, in line with human agreement values. We find that LLMs align better with the human instructors giving high scores. Finally, we establish that a prompting strategy that scores multiple aspects of the writing together is the most effective, an important finding for classroom deployment.
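The abstract highlights a prompting strategy that scores multiple aspects of the writing in a single call. A minimal sketch of such a joint prompt follows; the rubric dimensions and the generate() stand-in are illustrative assumptions, not the paper's actual schema or models:

```python
# Sketch of "joint" multi-aspect scoring: one LLM call scores all rubric dimensions,
# instead of one call per dimension. Dimension names are hypothetical.
import json

DIMENSIONS = ["structure", "clarity", "research question", "methodology"]

def build_joint_prompt(proposal_text: str) -> str:
    dims = ", ".join(DIMENSIONS)
    return (
        "You are assessing a student research proposal.\n"
        f"Score each of the following aspects from 1 (poor) to 5 (excellent): {dims}.\n"
        "Return a JSON object mapping each aspect to its integer score.\n\n"
        f"Proposal:\n{proposal_text}"
    )

def score_proposal(proposal_text: str, generate) -> dict:
    """`generate` is any callable that sends the prompt to an LLM and returns its text output."""
    response = generate(build_joint_prompt(proposal_text))
    return json.loads(response)  # e.g. {"structure": 4, "clarity": 3, ...}
```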
Problem

Research questions and friction points this paper is trying to address.

academic writing assessment
peer feedback
exposé
automated scoring
higher education
Innovation

Methods, ideas, or system contributions that make the work stand out.

academic writing assessment
peer feedback
large language models
educational dataset
automated scoring
Dennis Zyska
Ubiquitous Knowledge Processing Lab (UKP Lab), Department of Computer Science and Hessian Center for AI (hessian.AI), Technical University of Darmstadt
Alla Rozovskaya
Department of Computer Science at Queens College, City University of New York (CUNY)
Ilia Kuznetsov
UKP Lab, TU Darmstadt
natural language processing, scholarly AI, peer review, intertextuality, interpretability
Iryna Gurevych
Full Professor, TU Darmstadt; Adjunct Professor, MBZUAI, UAE; Affiliated Professor, INSAIT, Bulgaria
Natural Language Processing, Large Language Models, Artificial Intelligence