The Unsampled Truth: Psychometrics in SLMs Measure Prompt Artifacts, Not Psychological Constructs

📅 2026-06-02
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
This study asks whether the outputs of small language models (SLMs) on psychometric tasks reflect genuine semantic reasoning or are driven mainly by artifacts of prompt formulation. The authors propose the first diagnostic framework for disentangling the two: it systematically varies personas (role framing), instructions, item content, and option labels, then uses controlled experiments and variance decomposition to quantify the relative contributions of semantic signal and prompt-induced artifacts. Across 13 open-weights models (0.6B to 14B parameters), prompt artifacts frequently dominate model responses, substantially undermining their psychometric validity. The framework identifies these confounds and offers a way to isolate semantic understanding in future research on frontier models.
📝 Abstract
When prompting SLMs for psychometric assessments, researchers assume the outputs reflect semantic reasoning. We evaluate this premise across 13 open-weights models (0.6B to 14B parameters) using a prompt variation framework that separates semantic signals from prompt artifacts. By systematically varying personas, instructions, items, and option symbols, we find that artifactual variance frequently overpowers the semantic signal. In these cases, models predominantly reflect prompt compliance rather than simulated psychological traits. While these findings limit SLM utility in psychometrics, our framework provides a diagnostic tool to identify destructive artifacts and isolate semantic understanding for future frontier-model research.
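To make the idea of separating semantic signal from prompt artifacts concrete, here is a minimal Python sketch of a main-effects variance decomposition over a full-factorial prompt design. It is not the authors' implementation. The four factor names follow the abstract, but the levels, the `toy_model` stand-in for an SLM call, and the `main_effect_shares` helper are hypothetical, and which factors count as "semantic" versus "artifact" is left open here.

```python
from itertools import product
from statistics import mean

# Factor names follow the abstract; the levels below are made up for illustration.
DESIGN = {
    "persona": ["neutral", "extravert"],
    "instruction": ["terse", "verbose"],
    "item": ["I am talkative", "I am reserved"],
    "symbols": ["1-5", "A-E"],
}


def toy_model(persona, instruction, item, symbols):
    """Hypothetical stand-in for an SLM call; returns a Likert-style score."""
    item_signal = {"I am talkative": 4, "I am reserved": 2}[item]
    symbol_bias = {"1-5": 0, "A-E": 1}[symbols]  # pretend letter labels shift answers
    return item_signal + symbol_bias


def main_effect_shares(scores, factors):
    """Share of total variance explained by each factor's main effect.

    `scores` maps a tuple of levels (one per factor, in order) to a number.
    Assumes a balanced full-factorial design with one score per cell.
    """
    grand = mean(scores.values())
    ss_total = sum((y - grand) ** 2 for y in scores.values())
    shares = {}
    for idx, name in enumerate(factors):
        by_level = {}
        for levels, y in scores.items():
            by_level.setdefault(levels[idx], []).append(y)
        ss = sum(len(ys) * (mean(ys) - grand) ** 2 for ys in by_level.values())
        shares[name] = ss / ss_total if ss_total else 0.0
    # Whatever the main effects leave unexplained is attributed to interactions.
    shares["interactions"] = (1.0 - sum(shares.values())) if ss_total else 0.0
    return shares


factors = list(DESIGN)
scores = {cell: toy_model(*cell) for cell in product(*DESIGN.values())}
shares = main_effect_shares(scores, factors)
for name, share in shares.items():
    print(f"{name:12s} {share:.2f}")
```

In this toy setup the item accounts for 80% of the variance and the option symbols for 20%, so the label format alone moves the score even though it carries no psychological content. With real model outputs one would replace `toy_model` with sampled responses and add a residual term for sampling noise.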
Problem

Research questions and friction points this paper is trying to address.

psychometrics
small language models
prompt artifacts
semantic reasoning
psychological constructs
Innovation

Methods, ideas, or system contributions that make the work stand out.

prompt artifacts
psychometrics
semantic signal
prompt variation framework
small language models