🤖 AI Summary
To address the limitations of general-purpose large language models (LLMs) in programming education—including insufficient cognitive modeling, poor adaptability to diverse learning styles, and an inability to deliver real-time, context-aware feedback—this paper proposes a skill-hierarchy-aware retrieval-augmented generation (RAG) prompting framework. The method integrates programmable skill-graph modeling with domain-knowledge-driven prompt engineering to enable automated classification of student proficiency and dynamically adaptive feedback generation. Its key innovation lies in aligning pedagogical intent, fine-grained skill granularity, and the LLM's generation process, thereby overcoming the semantic-pedagogical misalignment inherent in conventional RAG approaches to educational applications. Experimental evaluation across three programming task types demonstrates 92.3% accuracy in student skill-level identification, a 23% improvement in feedback readability, sub-1.8-second average response latency, and a 41% increase in feedback depth over baseline methods.
📝 Abstract
Recent advancements in artificial intelligence (AI) and machine learning have reignited interest in their impact on Computer-based Learning (CBL). AI-driven tools such as ChatGPT and Intelligent Tutoring Systems (ITSs) have enhanced learning experiences through personalisation and flexibility. ITSs can adapt to individual learning needs and provide customised feedback based on a student's performance, cognitive state, and learning path. Despite these advances, challenges remain in accommodating diverse learning styles and delivering real-time, context-aware feedback. Our research aims to address these gaps by integrating skill-aligned feedback via Retrieval-Augmented Generation (RAG) into prompt engineering for Large Language Models (LLMs) and by developing an application that enhances learning through personalised tutoring in a computer science programming context. A pilot study evaluated the proposed system using three quantitative metrics—readability score, response time, and feedback depth—across three programming tasks of varying complexity. The system successfully sorted simulated students into three skill-level categories and provided context-aware feedback. This targeted approach demonstrated greater effectiveness and adaptability than general-purpose methods.
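The abstract describes sorting students into three skill-level categories and assembling skill-aligned prompts for the tutoring LLM. The paper does not publish its implementation, so the sketch below is purely illustrative: the thresholds, the `StudentProfile` fields, and the tier-keyed guidance lookup are all assumptions standing in for the paper's skill-graph modeling and retrieval step.

```python
from dataclasses import dataclass

@dataclass
class StudentProfile:
    correct_rate: float   # fraction of test cases passed (0.0-1.0)
    avg_attempts: float   # average attempts per task

def classify_skill(profile: StudentProfile) -> str:
    """Rule-based proficiency classification; thresholds are illustrative,
    not the paper's actual model."""
    if profile.correct_rate < 0.4 or profile.avg_attempts > 3:
        return "beginner"
    if profile.correct_rate < 0.8:
        return "intermediate"
    return "advanced"

# Toy tier-keyed corpus; a real system would retrieve from embedded
# pedagogical documents rather than a dictionary lookup.
GUIDANCE = {
    "beginner": "Explain errors step by step, using analogies.",
    "intermediate": "Point to the failing logic and suggest debugging strategies.",
    "advanced": "Discuss complexity trade-offs and idiomatic refactoring.",
}

def build_prompt(task: str, code: str, profile: StudentProfile) -> str:
    """Assemble a skill-aligned prompt for the tutoring LLM."""
    tier = classify_skill(profile)
    retrieved = GUIDANCE[tier]  # stand-in for the RAG retrieval step
    return (
        f"You are a programming tutor. Student level: {tier}.\n"
        f"Pedagogical guidance: {retrieved}\n"
        f"Task: {task}\nStudent code:\n{code}\n"
        "Give context-aware feedback matched to the student's level."
    )

prompt = build_prompt("Sum a list", "total = sum(nums)", StudentProfile(0.9, 1.2))
```

The key design point the abstract implies is that retrieval is conditioned on the inferred skill tier before prompt construction, so the same task and code yield different feedback framing for different students.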