🤖 AI Summary
This work investigates how the Fisher Information Matrix (FIM) governs the function-space learning dynamics of two-layer ReLU networks in the infinite-width limit with random hidden-layer weights. Using asymptotic analysis and an approximate spectral decomposition of the FIM, we derive explicit asymptotic forms for the basis functions associated with four distinct classes of approximate FIM eigenvectors, showing that gradient descent in parameter space induces approximately orthogonal evolution in function space, as dictated by the FIM. Combining theoretical derivation with numerical simulation, we predict the limiting forms of these basis functions, connecting optimization trajectories, generalization bias, and expressive capacity in wide networks. The key contribution is an FIM-driven mapping from parameter space to function space, offering theoretical insight into the geometric structure of wide neural networks.
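To make the parameter-to-function mapping concrete, here is a minimal sketch of the standard objects involved (the notation below is ours and assumes the usual empirical-FIM conventions, not necessarily the paper's). For a network output $f(x;\theta)$ with parameter gradient $g(x) = \nabla_\theta f(x;\theta)$,

$$
F = \mathbb{E}_x\big[\, g(x)\, g(x)^{\top} \big],
\qquad
f_v(x) = g(x)^{\top} v,
\qquad
\langle f_u, f_v \rangle = \mathbb{E}_x\big[ f_u(x)\, f_v(x) \big] = u^{\top} F\, v,
$$

so eigenvectors of $F$ with distinct eigenvalues yield basis functions that are exactly orthogonal under this inner product, and approximate eigenvectors yield approximately orthogonal ones.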
📝 Abstract
We investigate the function-space dynamics of a two-layer ReLU neural network in the infinite-width limit, highlighting the role of the Fisher information matrix (FIM) in steering learning. Extending seminal work on the approximate eigendecomposition of the FIM, we derive the asymptotic behavior of the basis functions $f_v(x) = X^{\top} v$ associated with four groups of approximate eigenvectors, showing their convergence to distinct functional forms. These functions, which gradient descent prioritizes, exhibit FIM-induced inner products that are approximately orthogonal in function space, forging a novel connection between the parameter and function spaces. Simulations validate the accuracy of these theoretical approximations, confirming their practical relevance. By clarifying the role of the FIM-induced inner product in function space, we advance the theoretical framework for ReLU networks, illuminating their optimization and expressivity. Overall, this work provides a foundation for understanding wide neural networks and informs the design and analysis of scalable deep learning architectures.
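As a hedged numerical illustration of the inner-product identity above (a minimal sketch under our own assumptions: Gaussian data and weights, finite width, and an empirical FIM over a finite sample; this is not the paper's experimental setup), one can build a two-layer ReLU network, form its empirical FIM, and check that the basis functions attached to its eigenvectors, with the abstract's $X^{\top} v$ read as $g(x)^{\top} v$, are orthogonal under the data-averaged inner product:

```python
# Minimal numerical sketch (our construction, not the paper's code): a finite-width
# two-layer ReLU network f(x) = (1/sqrt(m)) * sum_i a_i relu(w_i . x) with random
# weights. We form the empirical FIM F = E_x[g(x) g(x)^T], g(x) = grad_theta f(x),
# and check that basis functions f_v(x) = g(x)^T v for eigenvectors v of F are
# orthogonal under <f_u, f_v> = E_x[f_u(x) f_v(x)] = u^T F v.
import numpy as np

rng = np.random.default_rng(0)
d, m, n = 10, 100, 2000            # input dim, hidden width, sample size (assumed)

W = rng.standard_normal((m, d))    # random hidden-layer weights (frozen)
a = rng.standard_normal(m)         # output weights
X = rng.standard_normal((n, d))    # inputs x ~ N(0, I_d)

# Per-example parameter gradients, theta = (a, W):
#   df/da_i = relu(w_i . x) / sqrt(m)
#   df/dw_i = a_i * 1[w_i . x > 0] * x / sqrt(m)
pre = X @ W.T                                  # (n, m) pre-activations
grad_a = np.maximum(pre, 0.0) / np.sqrt(m)     # (n, m)
gate = (pre > 0).astype(float)                 # ReLU indicator 1[w_i . x > 0]
grad_W = (a * gate)[:, :, None] * X[:, None, :] / np.sqrt(m)   # (n, m, d)
G = np.concatenate([grad_a, grad_W.reshape(n, -1)], axis=1)    # rows are g(x_k)^T

F = G.T @ G / n                                # empirical FIM, (P, P)
eigvals, eigvecs = np.linalg.eigh(F)           # ascending eigenvalues
V = eigvecs[:, -5:]                            # a few leading eigenvectors

f_v = G @ V                                    # f_v(x_k) = g(x_k)^T v, one column per v
gram = f_v.T @ f_v / n                         # Gram of basis functions = V^T F V

print("leading eigenvalues:", np.round(eigvals[-5:], 4))
print("function-space Gram (should be ~ diag of eigenvalues):")
print(np.round(gram, 4))
```

With exact numerical eigenvectors the Gram matrix is diagonal up to floating-point error; the paper's contribution concerns analytically constructed approximate eigenvectors in the infinite-width limit, for which this orthogonality holds only approximately.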