🤖 AI Summary
This work addresses the gap between current large language models (LLMs) and human cognitive mechanisms in higher-order reasoning, which limits robust generalization. Focusing on deductive reasoning, the study moves from correlational to proactive use of task-based fMRI signals: it first quantifies how well LLM representations predict activity in reasoning-related brain regions using a neural-predictivity metric, then introduces brain-guided interventions at two stages, namely representation steering at inference and fine-tuning during training. The approach goes beyond language-only supervision, yielding up to a 13% absolute gain in reasoning accuracy across 10 LLMs ranging from 1.5B to 72B parameters. The gains are orthogonal to language-only supervision and transfer across reasoning types.
📝 Abstract
The correspondence between large language models (LLMs) and the neural mechanisms underlying human higher-order cognition remains insufficiently characterized. Given that language and reasoning in the human brain appear dissociable, an open question is whether LLMs align with neural signals from reasoning-related regions and whether such signals can improve them. Here, focusing on deductive reasoning, we show that LLM internal representations are not only partially aligned with task-fMRI activity but can also be directly enhanced by these signals. Using a neural-predictivity metric, we find that LLMs explain a substantial fraction of the explainable variance in reasoning-related regions at the aggregate level, whereas predictivity within specific reasoning types is lower, indicating both alignment and divergence. Building on this, we propose a brain-guided framework: we steer model representations along directions induced by the joint structure of model and brain representations, applying intervention at inference and fine-tuning during training. We demonstrate that task-evoked brain signals can directly enhance LLM reasoning, yielding gains orthogonal to language-only supervision across 10 LLMs (1.5B-72B), with transfer across reasoning types and up to 13% absolute accuracy gain. Our results advance LLM-brain correspondences from correlation to guidance, establishing a brain-signal-driven pathway toward more robust and cognitively aligned AI.
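The abstract does not specify how the neural-predictivity metric is computed. As an illustration only, the sketch below uses one common encoding-model recipe: fit a cross-validated ridge regression from model features to voxel responses and score each voxel by the Pearson correlation between predicted and held-out responses. The function names (`ridge_fit`, `neural_predictivity`), the ridge penalty, the fold count, and the synthetic data are all assumptions for the example, not the authors' procedure, and the noise-ceiling normalization implied by "explainable variance" is omitted.

```python
import numpy as np

def ridge_fit(X, Y, alpha):
    # Closed-form ridge solution: W = (X^T X + alpha * I)^{-1} X^T Y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ Y)

def neural_predictivity(X, Y, alpha=1.0, n_folds=5, seed=0):
    """Hypothetical helper: mean cross-validated Pearson r per voxel.

    X: (n_stimuli, n_features) model representations (assumed input).
    Y: (n_stimuli, n_voxels) brain responses (assumed input).
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(X.shape[0]), n_folds)
    scores = []
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # Center using training statistics only, to avoid leakage.
        mu_x, mu_y = X[train].mean(axis=0), Y[train].mean(axis=0)
        W = ridge_fit(X[train] - mu_x, Y[train] - mu_y, alpha)
        pred = (X[test] - mu_x) @ W + mu_y
        # Pearson correlation per voxel on the held-out fold.
        pc = pred - pred.mean(axis=0)
        yc = Y[test] - Y[test].mean(axis=0)
        denom = np.linalg.norm(pc, axis=0) * np.linalg.norm(yc, axis=0) + 1e-12
        scores.append((pc * yc).sum(axis=0) / denom)
    return np.mean(scores, axis=0)

# Synthetic demo (not real fMRI data): one target that depends on X, one that does not.
rng = np.random.default_rng(0)
n, d, v = 200, 16, 8
X = rng.standard_normal((n, d))
Y_signal = X @ rng.standard_normal((d, v)) + 0.5 * rng.standard_normal((n, v))
Y_noise = rng.standard_normal((n, v))

r_signal = neural_predictivity(X, Y_signal)
r_noise = neural_predictivity(X, Y_noise)
print("mean r (signal):", r_signal.mean())
print("mean r (noise): ", r_noise.mean())
```

In this toy setting the responses that depend linearly on the features score far higher than pure noise, which is the contrast a predictivity metric is meant to capture. A real analysis would also need to choose which model layer supplies the features, handle the hemodynamic response, and normalize by a noise ceiling.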