Delving Into the Psychology of Machines: Exploring the Structure of Self-Regulated Learning via LLM-Generated Survey Responses

📅 2025-06-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study evaluates the validity of large language model (LLM) responses to a self-regulated learning (SRL) scale, the Motivated Strategies for Learning Questionnaire (MSLQ), to assess their utility for theoretical validation and intervention design. Method: Responses were generated by five LLMs (GPT-4o, Claude 3.7 Sonnet, Gemini 2 Flash, LLaMA 3.1-8B, Mistral Large) and evaluated with a validation framework that combines confirmatory and exploratory factor analysis (CFA/EFA) with psychometric network analysis to jointly assess structural validity and theoretical alignment. Contribution/Results: Gemini 2 Flash was the most promising model: its responses showed considerable sampling variability and produced a factor structure and inter-dimensional relationships that align with prior theory and empirical findings. Overall, LLMs show promising capacity to simulate SRL-related psychological data, but the study also reveals discrepancies and limitations, highlighting both the potential and the current constraints of using LLMs for SRL measurement and theory testing.

📝 Abstract
Large language models (LLMs) offer the potential to simulate human-like responses and behaviors, creating new opportunities for psychological science. In the context of self-regulated learning (SRL), if LLMs can reliably simulate survey responses at scale and speed, they could be used to test intervention scenarios, refine theoretical models, augment sparse datasets, and represent hard-to-reach populations. However, the validity of LLM-generated survey responses remains uncertain, with limited research focused on SRL and existing studies beyond SRL yielding mixed results. Therefore, in this study, we examined LLM-generated responses to the 44-item Motivated Strategies for Learning Questionnaire (MSLQ; Pintrich & De Groot, 1990), a widely used instrument assessing students' learning strategies and academic motivation. Specifically, we used the LLMs GPT-4o, Claude 3.7 Sonnet, Gemini 2 Flash, LLaMA 3.1-8B, and Mistral Large. We analyzed item distributions, the psychological network of the theoretical SRL dimensions, and psychometric validity based on the latent factor structure. Our results suggest that Gemini 2 Flash was the most promising LLM, showing considerable sampling variability and producing underlying dimensions and theoretical relationships that align with prior theory and empirical findings. At the same time, we observed discrepancies and limitations, underscoring both the potential and current constraints of using LLMs for simulating psychological survey data and applying it in educational contexts.
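To make the "psychological network of the theoretical SRL dimensions" more concrete, here is a minimal Python sketch of one common way such a network can be estimated: average the items into subscale scores, then compute partial correlations between the subscales. This is an illustration under stated assumptions, not the authors' pipeline. The abstract does not say which estimator was used, so unregularized partial correlations stand in here. The subscale names and sizes are assumed, the item-to-subscale column mapping is a placeholder, and the responses are random synthetic data rather than LLM output.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed subscale names and sizes (9 + 9 + 4 + 13 + 9 = 44 items).
# The contiguous column ranges are a placeholder, NOT the real scoring key.
SUBSCALES = {
    "self_efficacy": range(0, 9),
    "intrinsic_value": range(9, 18),
    "test_anxiety": range(18, 22),
    "cognitive_strategy_use": range(22, 35),
    "self_regulation": range(35, 44),
}


def subscale_scores(responses: np.ndarray) -> np.ndarray:
    """Average each subscale's items; returns (n_respondents, n_subscales)."""
    return np.column_stack(
        [responses[:, list(cols)].mean(axis=1) for cols in SUBSCALES.values()]
    )


def partial_correlations(scores: np.ndarray) -> np.ndarray:
    """Partial correlation matrix derived from the inverse correlation matrix."""
    precision = np.linalg.inv(np.corrcoef(scores, rowvar=False))
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor


# Synthetic stand-in for generated survey data: 500 respondents, 7-point items.
responses = rng.integers(1, 8, size=(500, 44)).astype(float)
scores = subscale_scores(responses)
network = partial_correlations(scores)  # edge weights between subscales
```

Each off-diagonal entry of `network` is the association between two subscales after controlling for the other three, which is what an edge represents in this kind of network. With the random data used here the edges should sit near zero; real or simulated responses with genuine structure would not.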
Problem

Research questions and friction points this paper is trying to address.

Assessing validity of LLM-generated survey responses in psychology
Exploring LLM simulation of self-regulated learning behaviors
Evaluating psychometric properties of LLM-produced SRL data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Using multiple LLMs for survey response simulation
Analyzing psychological networks of SRL dimensions
Validating psychometric properties of LLM-generated data
Leonie V.D.E. Vogelsmeier
Department of Methodology and Statistics, Tilburg University, The Netherlands
Eduardo Araujo Oliveira
The University of Melbourne
Learning analytics, stylometry, AI in education, software engineering
Kamila Misiejuk
Center of Advanced Technology for Assisted Learning and Predictive Analytics (CATALPA), FernUniversität in Hagen, Germany
Sonsoles López-Pernas
University of Eastern Finland, School of Computing, Joensuu, Yliopistokatu 2, 80100, Joensuu, Finland
Mohammed Saqr
Associate Professor, University of Eastern Finland
Learning analytics, Artificial Intelligence, Idiographic analytics, Network Science