🤖 AI Summary
This study addresses the challenge of delivering timely, personalized feedback in large-scale programming courses. To this end, the authors propose a locally deployed automated grading system that integrates role-based prompt engineering with large language models (LLMs). The system validates functional correctness through unit tests and leverages LLMs to generate interpretable, pedagogically oriented feedback on code quality while maintaining transparency in its reasoning process. In a pilot deployment involving 191 students, the AI-generated scores showed no significant linear correlation with human grades (r = −0.177) but exhibited a similarly shaped distribution. Although the AI's scoring was notably more conservative (mean = 59.95 vs. 80.53 for human graders), it substantially outperformed human graders in the coverage and depth of its technical feedback.
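The two-stage design described above, unit tests for functional correctness followed by a role-conditioned LLM critique, can be sketched as follows. This is a minimal illustration, not the actual Mark My Works implementation: all names (`build_role_prompt`, `run_unit_tests`, the role personas) are hypothetical, and the LLM call itself is omitted.

```python
# Hypothetical sketch of a unit-test-plus-role-prompt grading pipeline.
# Names and personas are illustrative, not the paper's actual system.

ROLES = {
    "code_reviewer": "You are a senior code reviewer. Critique code quality, "
                     "naming, and structure.",
    "instructor": "You are a programming instructor. Give pedagogical, "
                  "actionable feedback a student can learn from.",
}

def build_role_prompt(role: str, submission: str, test_report: str) -> str:
    """Combine a role persona with the submission and its unit-test results."""
    return (
        f"{ROLES[role]}\n\n"
        f"Unit test results:\n{test_report}\n\n"
        f"Student submission:\n{submission}\n\n"
        "Explain your reasoning step by step before giving a score."
    )

def run_unit_tests(submission_fn, cases):
    """Return (pass_rate, report) for a list of (args, expected) cases."""
    lines, passed = [], 0
    for args, expected in cases:
        try:
            ok = submission_fn(*args) == expected
        except Exception:
            ok = False
        passed += ok
        lines.append(f"{args} -> expected {expected}: {'PASS' if ok else 'FAIL'}")
    return passed / len(cases), "\n".join(lines)

# Example: check a student implementation of absolute value, then build the
# prompt that would be sent to the LLM for qualitative feedback.
student_code = "def abs_val(x):\n    return x if x > 0 else -x\n"
namespace = {}
exec(student_code, namespace)
rate, report = run_unit_tests(namespace["abs_val"],
                              [((3,), 3), ((-2,), 2), ((0,), 0)])
prompt = build_role_prompt("code_reviewer", student_code, report)
print(f"pass rate: {rate:.2f}")
```

Separating objective correctness (unit tests) from subjective quality (the role-conditioned prompt) is what lets the system report a transparent reasoning trace alongside a score.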
📝 Abstract
Large programming courses struggle to provide timely, detailed feedback on student code. We developed Mark My Works, a local autograding system that combines traditional unit testing with LLM-generated explanations. The system uses role-based prompts to analyze submissions, critique code quality, and generate pedagogical feedback while maintaining transparency in its reasoning process. We piloted the system in a 191-student engineering course, comparing AI-generated assessments with human grading on 79 submissions. While AI scores showed no linear correlation with human scores (r = −0.177, p = 0.124), both systems exhibited similar left-skewed distributions, suggesting they recognize comparable quality hierarchies despite different scoring philosophies. The AI system scored more conservatively (mean 59.95 vs. 80.53 for human graders) but generated significantly more detailed technical feedback.
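The reported statistic (Pearson r over 79 paired scores) measures only linear association between the two graders. A minimal sketch of how such a coefficient is computed, using made-up score pairs rather than the study's data:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Illustrative (made-up) paired scores, NOT the study's 79 submissions:
ai_scores = [55, 62, 48, 70, 58]
human_scores = [85, 78, 90, 72, 80]
r = pearson_r(ai_scores, human_scores)
print(f"r = {r:.2f}")
```

Note that a near-zero r with similarly shaped distributions, as the study observed, means the two graders agree on the overall spread of quality without ranking individual submissions the same way.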