🤖 AI Summary
This study addresses the challenge of analyzing student debugging behavior and delivering personalized feedback in programming education. It proposes an encoder-decoder model that learns edit representations from iterative code submissions: (1) it learns code-edit embeddings between consecutive submissions to capture how students correct their errors; (2) it uses per-test-case pass/fail information when fine-tuning LLMs, so that next-step suggestions preserve the student's coding style while improving test-case correctness; (3) it clusters the learned edit embeddings to identify common student errors and debugging behaviors. Evaluated on a real-world student code submission dataset, the model performs well at code reconstruction and personalized code suggestion, and the clustering reveals recurring patterns in student debugging behavior.
📝 Abstract
Providing effective feedback for programming assignments in computer science education can be challenging: students solve problems by iteratively submitting code, executing it, and using limited feedback from the compiler or the auto-grader to debug. Analyzing student debugging behavior in this process may reveal important insights into their knowledge and inform better personalized support tools. In this work, we propose an encoder-decoder-based model that learns meaningful code-edit embeddings between consecutive student code submissions, to capture their debugging behavior. Our model leverages information on whether a student code submission passes each test case to fine-tune large language models (LLMs) to learn code editing representations. It enables personalized next-step code suggestions that maintain the student's coding style while improving test case correctness. Our model also enables us to analyze student code-editing patterns to uncover common student errors and debugging behaviors, using clustering techniques. Experimental results on a real-world student code submission dataset demonstrate that our model excels at code reconstruction and personalized code suggestion while revealing interesting patterns in student debugging behavior.
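To make the data behind this setup concrete, the sketch below describes a single edit between two consecutive submissions together with the change in per-test-case outcomes, then groups edits by a coarse signature. It is an illustration under stated assumptions, not the model described above: the hand-written features stand in for learned code-edit embeddings, the signature grouping stands in for clustering, and the names `edit_features` and `group_edits` are hypothetical.

```python
import difflib
from collections import defaultdict


def edit_features(before, after, tests_before, tests_after):
    """Summarize one edit between two consecutive submissions.

    Hand-written stand-in for a learned edit embedding (assumption):
    counts changed lines and how per-test-case outcomes moved.
    """
    a, b = before.splitlines(), after.splitlines()
    added = removed = 0
    for tag, i1, i2, j1, j2 in difflib.SequenceMatcher(None, a, b).get_opcodes():
        if tag in ("replace", "delete"):
            removed += i2 - i1
        if tag in ("replace", "insert"):
            added += j2 - j1
    pairs = list(zip(tests_before, tests_after))
    fixed = sum(1 for old, new in pairs if not old and new)   # fail -> pass
    broken = sum(1 for old, new in pairs if old and not new)  # pass -> fail
    return {"added": added, "removed": removed, "fixed": fixed, "broken": broken}


def group_edits(features):
    """Group edits by a coarse signature.

    Stand-in for clustering learned embeddings (assumption): edits with the
    same size bucket and test outcome direction land in the same group.
    """
    groups = defaultdict(list)
    for idx, f in enumerate(features):
        size = "small" if f["added"] + f["removed"] <= 2 else "large"
        if f["fixed"] > f["broken"]:
            outcome = "progress"
        elif f["broken"] > f["fixed"]:
            outcome = "regress"
        else:
            outcome = "no_change"
        groups[(size, outcome)].append(idx)
    return dict(groups)
```

Calling `edit_features` on each pair of consecutive submissions in a student's history and passing the resulting list to `group_edits` gives a rough view of which kinds of edits tend to fix or break test cases. A learned embedding would replace the four hand-picked counts with a vector produced by the model.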