Evaluating the Impact of AI-Powered Audiovisual Personalization on Learner Emotion, Focus, and Learning Outcomes

📅 2025-05-05
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Independent learners in unstructured environments frequently experience diminished attention and difficulties in emotion regulation, yet existing educational technologies overlook integrated sensory–affective interventions. This study introduces the first multimodal large language model (LLM) system for learning-oriented sensory context modeling. By fusing physiological signals—including eye-tracking and galvanic skin response—with mixed-methods research, the system dynamically generates personalized audiovisual environments (e.g., visual themes and auditory elements) to enable concurrent regulation of affective, perceptual, and cognitive states. It innovatively extends multimodal LLM generative capabilities to real-time learning context adaptation, addressing a critical gap in affective educational technology concerning sensory dimensions. Empirical evaluation demonstrates statistically significant reductions in subjective cognitive load (*p* < 0.01), a 37% increase in sustained attention duration, a 22% improvement in knowledge retention, and pronounced emotion-regulatory effects specifically among anxiety-prone learners.

Technology Category

Application Category

📝 Abstract
Independent learners often struggle with sustaining focus and emotional regulation in unstructured or distracting settings. Although some rely on ambient aids such as music, ASMR, or visual backgrounds to support concentration, these tools are rarely integrated into cohesive, learner-centered systems. Moreover, existing educational technologies focus primarily on content adaptation and feedback, overlooking the emotional and sensory context in which learning takes place. Large language models have demonstrated powerful multimodal capabilities including the ability to generate and adapt text, audio, and visual content. Educational research has yet to fully explore their potential in creating personalized audiovisual learning environments. To address this gap, we introduce an AI-powered system that uses LLMs to generate personalized multisensory study environments. Users select or generate customized visual themes (e.g., abstract vs. realistic, static vs. animated) and auditory elements (e.g., white noise, ambient ASMR, familiar vs. novel sounds) to create immersive settings aimed at reducing distraction and enhancing emotional stability. Our primary research question investigates how combinations of personalized audiovisual elements affect learner cognitive load and engagement. Using a mixed-methods design that incorporates biometric measures and performance outcomes, this study evaluates the effectiveness of LLM-driven sensory personalization. The findings aim to advance emotionally responsive educational technologies and extend the application of multimodal LLMs into the sensory dimension of self-directed learning.
Problem

Research questions and friction points this paper is trying to address.

AI-powered audiovisual personalization for learner focus and emotion
Integrating multisensory elements into learner-centered educational systems
Evaluating impact of personalized audiovisual environments on learning outcomes
Innovation

Methods, ideas, or system contributions that make the work stand out.

AI-powered system using LLMs for multisensory environments
Personalized audiovisual elements to reduce distraction
Mixed-methods design with biometric measures for evaluation
🔎 Similar Papers
No similar papers found.
G
George Xi Wang
New York University, USA
J
Jingying Deng
New York University, USA
Safinah Ali
Safinah Ali
Assistant Professor, New York University
Human Computer Interaction