AI Summary
This work addresses the challenges of low sample efficiency and reward function design in deep reinforcement learning for intelligent tutoring systems, which hinder effective modeling of students' dynamically evolving learning strategies. To overcome these limitations, the authors propose THEMES, a framework based on generalized apprenticeship learning that introduces, for the first time, a time-varying multidimensional reward function. By leveraging only 18 expert demonstration trajectories from historical semester data, THEMES jointly integrates inverse reinforcement learning and policy generalization to accurately capture the temporal complexity and non-stationarity of teaching strategies. Evaluated on the task of predicting subsequent-semester instructional decisions, the method achieves an AUC of 0.899 and a Jaccard index of 0.653, significantly outperforming six state-of-the-art baselines and demonstrating strong efficacy and generalization capability.
Abstract
Reinforcement Learning (RL) and Deep Reinforcement Learning (DRL) have advanced rapidly in recent years and have been successfully applied to e-learning environments such as intelligent tutoring systems (ITSs). Despite this success, the broader application of DRL to educational technologies has been limited by major challenges, including sample inefficiency and the difficulty of designing reward functions. In contrast, Apprenticeship Learning (AL) uses a few expert demonstrations to infer the expert's underlying reward functions and derive decision-making policies that generalize and replicate optimal behavior. In this work, we leverage a generalized AL framework, THEMES, to induce effective pedagogical policies by capturing the complexities of the expert student learning process, where multiple reward functions may dynamically evolve over time. We evaluate THEMES against six state-of-the-art baselines and show that it achieves high performance, with an AUC of 0.899 and a Jaccard index of 0.653, using only 18 trajectories from a previous semester to predict student pedagogical decisions in a later semester, highlighting its potential as a powerful alternative for inducing effective pedagogical policies.
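The core idea behind AL that the abstract describes, namely inferring a reward function from a handful of expert demonstrations, can be illustrated with a minimal sketch. This is not the THEMES algorithm itself (which uses time-varying multidimensional rewards); it is a classic feature-expectation-matching step in the style of Abbeel and Ng's apprenticeship learning, with hypothetical toy trajectories and a one-hot state featurization invented for illustration:

```python
import numpy as np

def feature_expectations(trajectories, phi, gamma=0.9):
    """Empirical discounted feature expectations: mu = mean over
    trajectories of sum_t gamma^t * phi(s_t)."""
    mu = np.zeros_like(phi(trajectories[0][0]))
    for traj in trajectories:
        for t, s in enumerate(traj):
            mu += (gamma ** t) * phi(s)
    return mu / len(trajectories)

# Hypothetical featurization: one-hot over 3 discrete states.
def phi(s):
    v = np.zeros(3)
    v[s] = 1.0
    return v

# Toy data: the "expert" tends to reach state 2; a candidate
# policy's rollouts mostly stay near state 0.
expert_trajs = [[0, 1, 2], [0, 2, 2]]
policy_trajs = [[0, 0, 0], [0, 1, 0]]

mu_E = feature_expectations(expert_trajs, phi)
mu_pi = feature_expectations(policy_trajs, phi)

# One projection-style update: reward weights point from the
# candidate policy's feature expectations toward the expert's,
# yielding a linear reward R(s) = w . phi(s).
w = mu_E - mu_pi
w /= np.linalg.norm(w)
reward = lambda s: float(w @ phi(s))
```

Under these assumptions, states the expert visits more often than the candidate policy receive higher inferred reward, which is the signal an AL method then uses to improve the policy toward expert behavior.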