🤖 AI Summary
To address the difficulty of observing and assessing learners' cognitive states in VR-based education, this paper proposes what it describes as the first intelligent learning-monitoring and adaptive-response framework for VR online learning. Methodologically, it (1) models learning comprehension in VR environments from facial biometrics, including micro-expressions and gaze trajectories; (2) jointly analyzes facial signals and learning metadata by combining Item Response Theory with temporal modeling (TCN + MLP); and (3) provides a VR-native interactive toolkit that supports highlighting, annotation, and real-time feedback. Contributions include: (i) a multimodal VR learning dataset with comprehension labels, concept annotations, and item difficulty scores (25+ hours, 10 participants), presented as the first of its kind; (ii) statistically significant improvements in comprehension-state detection accuracy on a TOEIC VR assessment task; and (iii) open-source release of all data and baseline models.
📝 Abstract
This work introduces SMARTe-VR, a platform for student monitoring in an immersive virtual reality environment designed for online education. SMARTe-VR is designed to gather data for adaptive learning, focusing on facial biometrics and learning metadata. The platform allows instructors to create tailored learning sessions with video lectures. Its interface includes an Auto QA system to evaluate understanding, interaction tools (e.g., textbook highlighting and lecture tagging), and real-time feedback. Additionally, we release a dataset comprising 5 research challenges, with data from 10 users in VR-based TOEIC sessions. This dataset, spanning over 25 hours, includes facial features, learning metadata, 450 responses, question difficulty levels, concept tags, and understanding labels. Alongside the dataset, we present preliminary experiments using Item Response Theory models adapted for understanding detection from facial features. Two architectures were explored: a Temporal Convolutional Network for local features and a Multilayer Perceptron for global features.
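For readers unfamiliar with Item Response Theory, the sketch below shows the standard two-parameter logistic (2PL) response function, which gives the probability of a correct answer from a learner ability and an item difficulty. The abstract does not specify how facial features enter the model, so the `ability_from_features` helper, its linear form, and all numeric values are hypothetical and only illustrate one way a global feature vector could be mapped to an ability score.

```python
import math


def irt_2pl(theta: float, difficulty: float, discrimination: float = 1.0) -> float:
    """Standard 2PL IRT: probability of a correct response.

    theta          -- learner ability
    difficulty     -- item difficulty (same scale as theta)
    discrimination -- how sharply the item separates abilities
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


def ability_from_features(features, weights, bias=0.0):
    """Hypothetical helper (not from the paper): a single linear layer that
    maps a pooled facial-feature vector to a scalar ability estimate."""
    return sum(f * w for f, w in zip(features, weights)) + bias


# Illustrative values only; these are not taken from the dataset.
pooled_features = [0.2, -0.1, 0.5]
weights = [1.0, 0.5, 2.0]

theta = ability_from_features(pooled_features, weights)
p_correct = irt_2pl(theta, difficulty=0.0)
print(f"estimated ability: {theta:.2f}, P(correct): {p_correct:.3f}")
```

When ability equals difficulty, the 2PL function returns 0.5, and the probability rises as ability exceeds difficulty. In the setup the abstract describes, a learned network (TCN or MLP) would presumably take the place of the hand-set linear layer used here.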