🤖 AI Summary
Students learning to code often hold misconceptions about core programming concepts, which can lead to bugs and inefficient code and slow the learning of related concepts. This paper introduces McMining, the task of automatically mining such misconceptions from samples of a student's code. To support training and evaluation, the authors construct an extensible benchmark dataset of misconceptions paired with a large set of code samples in which those misconceptions are manifested, and they design two LLM-based McMiner approaches. The contributions are threefold: (1) a formulation of the misconception-mining task; (2) an extensible, structured misconception benchmark with accompanying code samples; and (3) an extensive empirical evaluation showing that models from the Gemini, Claude, and GPT families are effective at discovering misconceptions in student code. These results point toward new forms of personalized programming feedback and pedagogical intervention.
📝 Abstract
When learning to code, students often develop misconceptions about various programming language concepts. These can not only lead to bugs or inefficient code, but also slow down the learning of related concepts. In this paper, we introduce McMining, the task of mining programming misconceptions from samples of a student's code. To enable the training and evaluation of McMining systems, we develop an extensible benchmark dataset of misconceptions together with a large set of code samples in which these misconceptions are manifested. We then introduce two LLM-based McMiner approaches and, through extensive evaluations, show that models from the Gemini, Claude, and GPT families are effective at discovering misconceptions in student code.