Number Representations in LLMs: A Computational Parallel to Human Perception

📅 2025-02-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates whether large language models (LLMs) internally represent numerical values using a human-like logarithmic encoding, with finer discrimination for small magnitudes and a compressed representation for large ones, mirroring the sublinear psychophysical scaling of the human mental number line. Methodologically, the authors systematically analyze numeral embeddings across layers using PCA/PLS dimensionality reduction, geometric regression, and cross-layer embedding space modeling. The results reveal a robust, layer-wise non-uniform numerical encoding: inter-embedding distances closely follow a logarithmic function (R² > 0.98). Crucially, this structure emerges without explicit numerical training, indicating spontaneous convergence toward human-aligned computational principles. This work provides the first computational evidence of a parallel between LLM and human numerical cognition, advancing our understanding of symbol grounding and cognitive emergence in foundation models.
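The pipeline is only described at a high level here, so the snippet below is a minimal sketch of what such a layer-wise analysis could look like, assuming an open model (gpt2), the numerals 1-100, last-token hidden states, and Euclidean distance from the embedding of "1". The model choice, numeral range, and distance definition are illustrative assumptions, not the paper's exact protocol.

```python
# Minimal sketch (assumptions: gpt2, numerals 1-100, last-token states,
# Euclidean distance from the embedding of "1"); not the paper's exact setup.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer
from scipy.optimize import curve_fit
from sklearn.metrics import r2_score

model_name = "gpt2"  # assumed stand-in for the LLMs analyzed in the paper
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_hidden_states=True).eval()

numbers = np.arange(1, 101)

def numeral_embeddings(layer: int) -> np.ndarray:
    """Hidden state of each numeral's last token at the given layer."""
    vecs = []
    for n in numbers:
        ids = tokenizer(str(n), return_tensors="pt")
        with torch.no_grad():
            hidden = model(**ids).hidden_states[layer][0, -1]
        vecs.append(hidden.numpy())
    return np.stack(vecs)

def log_curve(x, a, b):
    return a * np.log(x) + b

for layer in (1, 6, 12):  # index 0 of hidden_states is the embedding layer
    emb = numeral_embeddings(layer)
    # Distance of every numeral's representation from that of "1".
    dists = np.linalg.norm(emb - emb[0], axis=1)
    (a, b), _ = curve_fit(log_curve, numbers.astype(float), dists)
    r2 = r2_score(dists, log_curve(numbers, a, b))
    print(f"layer {layer:2d}: R^2 of logarithmic distance fit = {r2:.3f}")
```

A high R² for the logarithmic fit across layers would correspond to the non-uniform encoding the summary reports.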

📝 Abstract
Humans are believed to perceive numbers on a logarithmic mental number line, where smaller values are represented with greater resolution than larger ones. This cognitive bias, supported by neuroscience and behavioral studies, suggests that numerical magnitudes are processed in a sublinear fashion rather than on a uniform linear scale. Inspired by this hypothesis, we investigate whether large language models (LLMs) exhibit a similar logarithmic-like structure in their internal numerical representations. By analyzing how numerical values are encoded across different layers of LLMs, we apply dimensionality reduction techniques such as PCA and PLS followed by geometric regression to uncover latent structures in the learned embeddings. Our findings reveal that the model's numerical representations exhibit sublinear spacing, with distances between values aligning with a logarithmic scale. This suggests that LLMs, much like humans, may encode numbers in a compressed, non-uniform manner.
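As a companion sketch of the PCA/PLS step the abstract mentions, the snippet below projects a numeral-embedding matrix to one dimension with PCA (unsupervised) and PLS (supervised by log n), then regresses the projection on log n. The embedding matrix is simulated so the snippet runs on its own; in practice it would be the layer-wise numeral embeddings extracted from the model, and the one-component setup is an illustrative assumption rather than the authors' exact configuration.

```python
# PCA/PLS sketch on a simulated numeral-embedding matrix (assumption:
# in the real analysis `emb` comes from the model's hidden states).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cross_decomposition import PLSRegression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
numbers = np.arange(1, 101)
log_n = np.log(numbers).reshape(-1, 1)

# Simulated embeddings: one latent direction carries log(n), the rest is noise.
direction = rng.normal(size=768)
emb = log_n * direction + 0.1 * rng.normal(size=(100, 768))

# Unsupervised view: first principal component of the embeddings.
pc1 = PCA(n_components=1).fit_transform(emb)

# Supervised view: the PLS component most predictive of log(n).
pls1 = PLSRegression(n_components=1).fit(emb, log_n).transform(emb)

for name, proj in [("PCA", pc1), ("PLS", pls1)]:
    reg = LinearRegression().fit(log_n, proj)
    r2 = r2_score(proj, reg.predict(log_n))
    print(f"{name} component vs log(n): R^2 = {r2:.3f}")
```

If the numerical code is indeed logarithmic, the one-dimensional projection should be close to an affine function of log n, giving an R² near 1 in both views.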
Problem

Research questions and friction points this paper is trying to address.

Investigates numerical representation in LLMs.
Compares LLM number encoding to human perception.
Uses PCA and PLS to analyze embeddings.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Logarithmic-like numerical representations
Dimensionality reduction techniques used
Sublinear spacing in LLMs (illustrated in the sketch below)
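To make the sublinear-spacing claim concrete, the toy check below measures the distance between the embeddings of consecutive numerals, which under a logarithmic code shrinks roughly like 1/n, so small numbers get the finest resolution. The embeddings are simulated with the same illustrative assumptions as in the sketch above, not taken from the paper.

```python
# Toy check: under a logarithmic code, d(n, n+1) ~ 1/n, so resolution is
# finest for small numbers. Simulated embeddings; illustrative only.
import numpy as np

rng = np.random.default_rng(0)
numbers = np.arange(1, 101)
direction = rng.normal(size=768)
emb = np.log(numbers)[:, None] * direction + 0.001 * rng.normal(size=(100, 768))

# Distance between each numeral's embedding and its successor's.
step = np.linalg.norm(emb[1:] - emb[:-1], axis=1)

for n in (1, 2, 10, 50, 90):
    print(f"d({n}, {n + 1}) = {step[n - 1]:.3f}")  # shrinks roughly like 1/n
```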
H. V. AlquBoj
Mohamed bin Zayed University of Artificial Intelligence (MBZUAI)
Hilal AlQuabeh
Mohamed bin Zayed University of Artificial Intelligence (MBZUAI)
Velibor Bojkovic
Mohamed bin Zayed University of Artificial Intelligence (MBZUAI)
Tatsuya Hiraoka
Mohamed bin Zayed University of Artificial Intelligence
Natural Language Processing
Ahmed Oumar El-Shangiti
Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) | RIKEN AIP
Machine learning, Natural language processing, Interpretability, Generative AI, Machine translation
Munachiso Nwadike
Mohamed bin Zayed University of Artificial Intelligence (MBZUAI)
Kentaro Inui
MBZUAI, Tohoku University, RIKEN
Natural language processing, Computational linguistics, LLM/LMM interpretability