🤖 AI Summary
This study systematically evaluates the validity of responses generated by large language models (LLMs) to self-regulated learning (SRL) psychological scales, specifically the Motivated Strategies for Learning Questionnaire (MSLQ), to assess their utility for theoretical validation and intervention design. Method: Responses were generated by five LLMs (GPT-4o, Claude 3.7 Sonnet, Gemini 2 Flash, LLaMA 3.1-8B, Mistral Large) and evaluated with an integrated validation framework combining confirmatory and exploratory factor analysis (CFA/EFA) with psychometric network analysis to jointly assess structural validity and theoretical alignment. Contribution/Results: Gemini 2 Flash performed best, replicating the MSLQ's established factor structure and inter-dimensional relationships while also showing considerable sampling variability. Overall, LLMs show promising capacity to simulate SRL-related psychological data, but limitations in response consistency, dimensionality fidelity, and construct validity remain, highlighting both the potential and the current constraints of leveraging LLMs for SRL measurement and theory testing.
📝 Abstract
Large language models (LLMs) offer the potential to simulate human-like responses and behaviors, creating new opportunities for psychological science. In the context of self-regulated learning (SRL), if LLMs can reliably simulate survey responses at scale and speed, they could be used to test intervention scenarios, refine theoretical models, augment sparse datasets, and represent hard-to-reach populations. However, the validity of LLM-generated survey responses remains uncertain, with limited research focused on SRL and existing studies beyond SRL yielding mixed results. Therefore, in this study, we examined LLM-generated responses to the 44-item Motivated Strategies for Learning Questionnaire (MSLQ; Pintrich & De Groot, 1990), a widely used instrument assessing students' learning strategies and academic motivation. Specifically, we used the LLMs GPT-4o, Claude 3.7 Sonnet, Gemini 2 Flash, LLaMA 3.1-8B, and Mistral Large. We analyzed item distributions, the psychological network of the theoretical SRL dimensions, and psychometric validity based on the latent factor structure. Our results suggest that Gemini 2 Flash was the most promising LLM, showing considerable sampling variability and producing underlying dimensions and theoretical relationships that align with prior theory and empirical findings. At the same time, we observed discrepancies and limitations, underscoring both the potential and the current constraints of using LLMs to simulate psychological survey data and to apply such data in educational contexts.