🤖 AI Summary
Current neural connectivity modeling in large language models (LLMs) inadequately captures functional neuron co-activation patterns, hindering interpretability and safety.
Method: We propose “Graph Probing,” a novel framework to characterize functional topology among neurons via graph-based representation learning.
Contribution/Results: Graph Probing reveals a highly sparse, predictive, and architecture-agnostic neural topology that emerges universally across diverse LLMs (varying in architecture, scale, and training data), and does so within just eight pretraining steps. Through graph neural representation analysis, cross-model topological alignment, and sparse subgraph evaluation, we show that as little as 1% of neuron connections suffices to reliably predict next-token prediction performance. These findings point to a shared topological regularity underlying LLMs' emergent capabilities and language generation efficacy. We publicly release the Graph Probing toolbox, establishing a new paradigm for mechanistic understanding and controllable optimization of foundation models.
📝 Abstract
Probing large language models (LLMs) has yielded valuable insights into their internal mechanisms by linking neural representations to interpretable semantics. However, how neurons functionally co-activate with each other to give rise to emergent capabilities remains largely unknown, hindering a deeper understanding and safer development of LLMs. In this work, we introduce graph probing, a method for uncovering the functional connectivity topology of LLM neurons and relating it to language generation performance. By analyzing internal neural graphs across diverse LLM families and scales, we discover a universal predictability of next-token prediction performance using only neural topology. This predictability is robust even when retaining just 1% of neuron connections or probing models after only 8 pretraining steps, highlighting the sparsity and early emergence of topological patterns. Further graph matching analysis suggests that, despite significant distinctions in architectures, parameters, and training data, different LLMs develop intricate and consistent neural topological structures that may form the foundation for their language generation abilities. Code and data for the graph probing toolbox are released at https://github.com/DavyMorgan/llm-graph-probing.
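The abstract reports cross-model graph matching without detailing the comparison. One simple, alignment-free way to compare neuron graphs from different models, shown here as a hypothetical stand-in rather than the paper's actual procedure, is to compare their graph Laplacian spectra, which are invariant to neuron reordering (the function name, `k` parameter, and similarity formula are assumptions for illustration):

```python
import numpy as np

def spectral_similarity(adj_a: np.ndarray, adj_b: np.ndarray, k: int = 16) -> float:
    """Alignment-free topological similarity between two neuron graphs.

    Compares the k smallest eigenvalues of each graph's unnormalized
    Laplacian; eigenvalues do not change under neuron permutation, so no
    explicit node correspondence between models is needed.
    """
    def spectrum(adj: np.ndarray) -> np.ndarray:
        w = np.abs(adj)                    # edge weights
        lap = np.diag(w.sum(axis=1)) - w   # unnormalized graph Laplacian
        return np.sort(np.linalg.eigvalsh(lap))[:k]

    dist = np.linalg.norm(spectrum(adj_a) - spectrum(adj_b))
    return 1.0 / (1.0 + dist)              # map distance into (0, 1]

# Sanity check: a permuted copy of a graph is topologically identical.
rng = np.random.default_rng(1)
a = rng.standard_normal((32, 32))
adj = (a + a.T) / 2
np.fill_diagonal(adj, 0.0)
perm = rng.permutation(32)
adj_perm = adj[np.ix_(perm, perm)]
sim_same = spectral_similarity(adj, adj_perm)   # close to 1.0

b = rng.standard_normal((32, 32))
adj_other = (b + b.T) / 2
np.fill_diagonal(adj_other, 0.0)
sim_diff = spectral_similarity(adj, adj_other)  # lower than sim_same
```

A spectral comparison like this only captures coarse structure; finer-grained matching (e.g. learned node embeddings) would be needed to align individual neurons across models.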