Convergent transformations of visual representation in brains and models

📅 2025-07-18
🤖 AI Summary
Problem: Is visual perception shaped by the structure of the external world or by the brain's intrinsic architecture? Method: Using three independent fMRI datasets of visual scene perception, we combine inter-subject similarity with layer-wise alignment to model hierarchies in a unified framework that traces representational flow, comparing human neural representations with those of vision deep neural networks (DNNs) and language models. Contribution/Results: We identify a cortex-wide network, conserved across individuals, organized into two pathways: a medial-ventral stream for scene structure and a lateral-dorsal stream tuned to social and biological content. The hierarchies of vision DNNs capture this functional organization, whereas language models do not, which points to a transformation specific to the visual-to-semantic path. These results indicate a convergent computational solution for visual encoding in human and artificial vision, driven by the structure of the external world.

📝 Abstract
A fundamental question in cognitive neuroscience is what shapes visual perception: the external world's structure or the brain's internal architecture. Although some perceptual variability can be traced to individual differences, brain responses to naturalistic stimuli evoke similar activity patterns across individuals, suggesting a convergent representational principle. Here, we test if this stimulus-driven convergence follows a common trajectory across people and deep neural networks (DNNs) during its transformation from sensory to high-level internal representations. We introduce a unified framework that traces representational flow by combining inter-subject similarity with alignment to model hierarchies. Applying this framework to three independent fMRI datasets of visual scene perception, we reveal a cortex-wide network, conserved across individuals, organized into two pathways: a medial-ventral stream for scene structure and a lateral-dorsal stream tuned for social and biological content. This functional organization is captured by the hierarchies of vision DNNs but not language models, reinforcing the specificity of the visual-to-semantic transformation. These findings show a convergent computational solution for visual encoding in both human and artificial vision, driven by the structure of the external world.
Problem

Research questions and friction points this paper is trying to address.

Whether visual perception is shaped by the external world's structure or by the brain's internal architecture
Whether stimulus-driven convergence follows a common trajectory across people and deep neural networks
How visual encoding is functionally organized in human and artificial vision
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines inter-subject similarity with model alignment
Identifies two conserved cortical visual pathways
Shows that the hierarchies of vision DNNs, unlike language models, capture this cortical organization
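
The combination of inter-subject similarity and model alignment listed above can be illustrated with a small toy sketch. This is not the authors' pipeline. It assumes a representational-similarity approach (correlation-distance dissimilarity matrices compared by Pearson correlation), which the summary does not specify, and it runs on synthetic data. All names here (`rdm`, `inter_subject_similarity`, `best_layer`, the "early" and "late" regions, the two-layer "model") are hypothetical and chosen only for illustration.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def rdm(patterns):
    """Condensed dissimilarity matrix: 1 - r for every pair of stimuli."""
    return [1.0 - pearson(a, b) for a, b in combinations(patterns, 2)]


def inter_subject_similarity(subject_rdms):
    """Mean pairwise correlation between the subjects' RDMs."""
    pairs = list(combinations(subject_rdms, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


def best_layer(region_rdm, layer_rdms):
    """Index of the model layer whose RDM correlates best with the region."""
    scores = [pearson(region_rdm, layer) for layer in layer_rdms]
    return max(range(len(scores)), key=scores.__getitem__)


def noisy(features, sd, rng):
    """Copy of a stimulus-by-feature table with Gaussian noise added."""
    return [[v + rng.gauss(0.0, sd) for v in row] for row in features]


def mean_rdm(rdms):
    """Element-wise average of several condensed RDMs."""
    return [sum(vals) / len(vals) for vals in zip(*rdms)]


# Synthetic data (assumption): two independent feature sets stand in for
# low-level and high-level stimulus structure.
rng = random.Random(0)
n_stimuli, n_features, n_subjects = 15, 20, 4
low = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_stimuli)]
high = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_stimuli)]

# Hypothetical two-layer model: layer 0 tracks low-level features,
# layer 1 tracks high-level features.
layer_rdms = [rdm(noisy(low, 0.1, rng)), rdm(noisy(high, 0.1, rng))]

# Each toy region is driven by one feature set plus subject-specific noise.
results = {}
for region, source in (("early", low), ("late", high)):
    subject_rdms = [rdm(noisy(source, 0.3, rng)) for _ in range(n_subjects)]
    results[region] = {
        "inter_subject_similarity": inter_subject_similarity(subject_rdms),
        "best_layer": best_layer(mean_rdm(subject_rdms), layer_rdms),
    }

print(results)
```

In this toy setup, a region counts as stimulus-driven when its dissimilarity structure agrees across subjects, and its position in the representational flow is read off from the model layer it matches best. A real analysis would replace the synthetic tables with measured fMRI response patterns and DNN activations, and would need appropriate statistics and noise ceilings.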