Enhancing Student Learning with LLM-Generated Retrieval Practice Questions: An Empirical Study in Data Science Courses

📅 2025-07-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Manual creation of retrieval practice questions in technical disciplines is time-intensive and inefficient. To address this, the study proposes a pedagogical enhancement method that leverages large language models (LLMs) to automatically generate high-quality multiple-choice questions, with customized prompt engineering and rigorous human validation ensuring question validity and reliability. Crucially, this work conducts the first controlled empirical study deploying LLM-generated retrieval questions in an authentic data science course. Students practicing with LLM-generated questions achieved 89% knowledge retention accuracy, significantly higher than the 73% observed in the no-practice condition (p < 0.01). The primary contribution is the first empirical evidence, gathered in a natural instructional setting, that LLM-generated retrieval questions enhance long-term memory retention. By establishing both empirical grounding and a practical implementation framework, this advances the scalable, sustainable development of intelligent educational resources.

📝 Abstract
Retrieval practice is a well-established pedagogical technique known to significantly enhance student learning and knowledge retention. However, generating high-quality retrieval practice questions is often time-consuming and labor-intensive for instructors, especially in rapidly evolving technical subjects. Large Language Models (LLMs) offer the potential to automate this process by generating questions in response to prompts, yet the effectiveness of LLM-generated retrieval practice on student learning remains to be established. We conducted an empirical study involving approximately 60 students across two college-level data science courses. We compared learning outcomes during one week in which students received LLM-generated multiple-choice retrieval practice questions with those from a week in which no such questions were provided. Students exposed to LLM-generated retrieval practice achieved significantly higher knowledge retention, with an average accuracy of 89%, compared to 73% in the week without such practice. These findings suggest that LLM-generated retrieval questions can effectively support student learning and may provide a scalable solution for integrating retrieval practice into real-time teaching. Despite these encouraging outcomes and the potential time savings, caution is warranted, as the quality of LLM-generated questions can vary: instructors must still manually verify and revise the generated questions before releasing them to students.
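The generate-then-review workflow described in the abstract can be sketched in Python. This is an illustrative assumption, not the authors' actual implementation: the function names (`build_mcq_prompt`, `parse_mcq`) and the fixed reply format are hypothetical, and the prompt would be sent to an LLM of the instructor's choice before parsing.

```python
# Hypothetical sketch: build a prompt asking an LLM for multiple-choice
# retrieval practice questions, then parse a reply in a fixed format.
# Names and format are assumptions, not from the paper.

def build_mcq_prompt(topic: str, n_questions: int = 3) -> str:
    """Compose a prompt requesting MCQs on a course topic in a fixed format."""
    return (
        f"Generate {n_questions} multiple-choice retrieval practice questions "
        f"on the topic '{topic}' for a college data science course.\n"
        "Format each question as:\n"
        "Q: <question>\n"
        "A) <option>\nB) <option>\nC) <option>\nD) <option>\n"
        "Answer: <letter>\n"
    )

def parse_mcq(block: str) -> dict:
    """Parse one LLM-returned question in the format above into a dict."""
    lines = [ln.strip() for ln in block.strip().splitlines() if ln.strip()]
    question = lines[0].removeprefix("Q:").strip()
    options = {ln[0]: ln[2:].strip() for ln in lines[1:5]}  # keys 'A'..'D'
    answer = lines[5].removeprefix("Answer:").strip()
    return {"question": question, "options": options, "answer": answer}
```

Keeping the reply format rigid makes parsing trivial and makes malformed LLM output easy to detect before it reaches students.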

Problem

Research questions and friction points this paper is trying to address.

Automating retrieval practice question generation using LLMs
Evaluating effectiveness of LLM-generated questions on learning
Ensuring quality of automated questions for student use

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-generated multiple-choice questions enhance learning
Automated retrieval practice saves instructor time
Manual verification ensures question quality
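The manual-verification step above could be preceded by automated sanity checks that flag obviously broken questions before an instructor reviews them. This is an illustrative sketch, not from the paper; the function name and the specific rules are assumptions, and it operates on a parsed question represented as a dict with `question`, `options`, and `answer` keys.

```python
# Illustrative sketch (an assumption, not the authors' method): automated
# sanity checks on a parsed MCQ before human review. An empty result means
# the question passes these checks; manual verification is still required.

def basic_quality_checks(q: dict) -> list:
    """Return a list of problems found in a parsed MCQ dict."""
    problems = []
    opts = q.get("options", {})
    if not q.get("question", "").strip():
        problems.append("empty question stem")
    if set(opts) != {"A", "B", "C", "D"}:
        problems.append("expected exactly options A-D")
    if len(set(opts.values())) != len(opts):
        problems.append("duplicate answer options")
    if q.get("answer") not in opts:
        problems.append("answer key does not match any option")
    return problems
```

Checks like these cannot judge pedagogical quality, which is why the paper's human-in-the-loop review remains essential; they only cheapen the review by filtering out structurally invalid output.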