Latent Structure of Affective Representations in Large Language Models

📅 2026-04-07
🤖 AI Summary
This study addresses the challenge of validating the geometric structure of affective representations in large language models (LLMs) and their alignment with human emotion models. It presents the first systematic application of geometric data analysis to examine whether LLM-derived emotional representations conform to the psychological valence–arousal circumplex model, while also evaluating how well their nonlinear structure can be approximated linearly and their capacity for uncertainty quantification. The findings reveal that LLMs learn coherent affective representations closely aligned with human emotion models, and that these representations are well-approximated by linear subspaces. Building on this insight, the work introduces a representation-space method for quantifying uncertainty in emotion-related tasks, offering empirical support for understanding LLMs' affective processing and informing their safety evaluation.
📝 Abstract
The geometric structure of latent representations in large language models (LLMs) is an active area of research, driven in part by its implications for model transparency and AI safety. Existing literature has focused mainly on general geometric and topological properties of the learnt representations, but due to a lack of ground-truth latent geometry, validating the findings of such approaches is challenging. Emotion processing provides an intriguing testbed for probing representational geometry, as emotions exhibit both categorical organization and continuous affective dimensions, which are well-established in the psychology literature. Moreover, understanding such representations carries safety relevance. In this work, we investigate the latent structure of affective representations in LLMs using geometric data analysis tools. We present three main findings. First, we show that LLMs learn coherent latent representations of emotions that align with widely used valence--arousal models from psychology. Second, we find that these representations exhibit nonlinear geometric structure that can nonetheless be well-approximated linearly, providing empirical support for the linear representation hypothesis commonly assumed in model transparency methods. Third, we demonstrate that the learned latent representation space can be leveraged to quantify uncertainty in emotion processing tasks. Our findings suggest that LLMs acquire affective representations with geometric structure paralleling established models of human emotion, with practical implications for model interpretability and safety.
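The abstract's second and third findings suggest two generic analyses: checking how well a point cloud of hidden states is captured by a low-dimensional linear subspace, and scoring uncertainty by distance in representation space. The paper does not specify its exact procedure; the sketch below illustrates both ideas on synthetic stand-in vectors (a noisy valence–arousal sheet embedded in a higher-dimensional ambient space), with PCA explained variance for linear approximability and nearest-centroid distance as a toy uncertainty score. All names and the data-generation setup here are illustrative assumptions, not the authors' method.

```python
# Hedged sketch, NOT the paper's implementation: synthetic vectors stand in
# for LLM hidden states, and both analyses are generic textbook versions.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "affective embeddings": points near a slightly curved 2-D
# valence-arousal sheet embedded in a 64-D ambient space.
n, d = 500, 64
valence = rng.uniform(-1, 1, n)
arousal = rng.uniform(-1, 1, n)
X = np.zeros((n, d))
X[:, 0] = valence
X[:, 1] = arousal
X[:, 2] = 0.3 * valence * arousal          # mild nonlinearity
X += 0.05 * rng.standard_normal((n, d))    # ambient noise

# 1) Linear approximability: fraction of total variance captured by a
#    k-dimensional PCA subspace. A high ratio at small k is the kind of
#    evidence the linear representation hypothesis predicts.
Xc = X - X.mean(axis=0)
_, s, _ = np.linalg.svd(Xc, full_matrices=False)
var_ratio = s**2 / np.sum(s**2)
k = 2
linear_fit = float(np.sum(var_ratio[:k]))
print(f"variance explained by {k}-D subspace: {linear_fit:.3f}")

# 2) Representation-space uncertainty: treat the four valence-arousal
#    quadrants as emotion "categories" and score each point by its
#    distance to the nearest category centroid, normalized to [0, 1].
labels = (valence > 0).astype(int) * 2 + (arousal > 0).astype(int)
centroids = np.stack([X[labels == c].mean(axis=0) for c in range(4)])
dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
uncertainty = dists.min(axis=1) / dists.min(axis=1).max()
print(f"mean uncertainty score: {uncertainty.mean():.3f}")
```

With real hidden states one would replace `X` by per-token or per-prompt activations for emotion-labeled inputs; the two scores themselves are model-agnostic.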
Problem

Research questions and friction points this paper is trying to address.

latent structure
affective representations
large language models
emotion processing
representational geometry
Innovation

Methods, ideas, or system contributions that make the work stand out.

affective representations
latent geometry
large language models
valence-arousal model
geometric data analysis