🤖 AI Summary
Current deep learning theory lacks a unified explanatory framework for key dynamical phenomena: neural collapse, emergence, the lazy and feature-learning regimes, grokking (sudden generalization), and generalization phase transitions. To address this, we propose a cross-scale dynamical framework grounded in layerwise linear models as analytically tractable primitives, integrated with dynamical-systems analysis, the neural tangent kernel (NTK) formalism, and linearized stability theory. Crucially, we generalize feedback mechanisms, previously studied only in simplified models, to full-scale deep networks for the first time, enabling unified modeling and analytical characterization of all five phenomena. Our framework reveals intrinsic connections among these behaviors and makes neural network dynamics analytically tractable, predictable, and designable, establishing a novel theoretical paradigm for foundational deep learning research.
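To make the lazy versus feature-learning distinction above concrete, here is a small sketch in the spirit of Chizat and Bach's lazy-training setup. It is our illustration, not code from the paper; the model size, target map, learning rate, and the `relative_weight_movement` helper are all assumptions. An output scale `alpha` controls whether a two-layer linear network trains lazily (weights stay near initialization, as in the NTK picture) or in the feature-learning regime (weights move substantially):

```python
import numpy as np

def relative_weight_movement(alpha, steps=2000, lr=0.01, d=4, seed=0):
    """Train f(x) = alpha * (W2 @ W1 - P0) @ x toward a fixed target map
    and return how far the first layer moved, relative to its init."""
    rng = np.random.default_rng(seed)
    target = np.diag(np.arange(1.0, d + 1))   # target linear map (assumed)
    W1 = rng.standard_normal((d, d))
    W2 = rng.standard_normal((d, d))
    W1_init = W1.copy()
    P0 = W2 @ W1                              # initial output, subtracted off
    for _ in range(steps):
        err = alpha * (W2 @ W1 - P0) - target
        g2 = alpha * err @ W1.T               # each layer's gradient is gated
        g1 = alpha * W2.T @ err               # by the other layer's weights
        W2 -= (lr / alpha**2) * g2            # 1/alpha^2 step size keeps the
        W1 -= (lr / alpha**2) * g1            # function-space dynamics comparable
    return np.linalg.norm(W1 - W1_init) / np.linalg.norm(W1_init)

print(relative_weight_movement(alpha=100.0))  # lazy: weights barely move
print(relative_weight_movement(alpha=0.2))    # rich: weights move substantially
```

The same layerwise linear model thus interpolates between the two regimes through a single scale parameter, which is the kind of analytical control the framework aims for.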
📝 Abstract
In physics, complex systems are often simplified into minimal, solvable models that retain only the core principles. In machine learning, layerwise linear models (e.g., linear neural networks) play the same role for neural network dynamics. These models obey the dynamical feedback principle, which describes how layers mutually govern and amplify each other's evolution. This principle extends beyond the simplified models, successfully explaining a wide range of dynamical phenomena in deep neural networks, including neural collapse, emergence, the lazy and rich regimes, and grokking. In this position paper, we call for the use of layerwise linear models that retain the core principles of neural dynamical phenomena to accelerate the science of deep learning.
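The dynamical feedback principle is already visible in the smallest layerwise linear model. Below is a minimal sketch (our illustration; the model size, target, and learning rate are assumptions, not taken from the paper): a two-layer linear network `W2 @ W1` trained by gradient descent on a squared loss. Each layer's gradient contains the other layer's weights, so small weights yield small gradients, and growth in one layer accelerates the other: the mutual governing and amplification the principle describes.

```python
import numpy as np

rng = np.random.default_rng(0)
target = np.diag([1.0, 2.0, 3.0, 4.0])   # full-rank target map (assumed)
W1 = 0.01 * rng.standard_normal((4, 4))  # small initialization
W2 = 0.01 * rng.standard_normal((4, 4))

for step in range(5000):
    err = W2 @ W1 - target               # residual of the end-to-end map
    grad_W2 = err @ W1.T                 # gradient of layer 2 is gated by W1
    grad_W1 = W2.T @ err                 # gradient of layer 1 is gated by W2
    W2 -= 0.1 * grad_W2
    W1 -= 0.1 * grad_W1

print(np.linalg.norm(W2 @ W1 - target))  # end-to-end map converges to the target
```

With a small initialization, the loss first plateaus while both layers are small and then drops rapidly once the layers have amplified each other, a minimal analogue of the sudden-transition phenomena listed above.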