The Impact of Anisotropic Covariance Structure on the Training Dynamics and Generalization Error of Linear Networks

📅 2026-01-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the impact of data anisotropy—particularly spiked covariance structures—on the training dynamics and generalization performance of two-layer linear networks. By leveraging random matrix theory and dynamical systems analysis, it reveals for the first time a two-phase evolution in the learning process: an initial rapid fitting along the leading principal components of the data, followed by a refined alignment with the target task. Building on this characterization, the study derives an analytical expression for the generalization error, quantifying how the alignment between data principal components and the target function critically enhances performance. The analysis further elucidates how anisotropic data structures steer the optimization trajectory and ultimately improve generalization.

📝 Abstract
The success of deep neural networks largely depends on the statistical structure of the training data. While learning dynamics and generalization on isotropic data are well established, the impact of pronounced anisotropy on these crucial aspects is not yet fully understood. We examine the impact of data anisotropy, represented by a spiked covariance structure (a canonical yet tractable model), on the learning dynamics and generalization error of a two-layer linear network in a linear regression setting. Our analysis reveals that the learning dynamics proceed in two distinct phases, governed initially by the input-output correlation and subsequently by the other principal directions of the data. Furthermore, we derive an analytical expression for the generalization error, quantifying how alignment between the spike structure of the data and the learning task improves performance. Our findings offer theoretical insight into how data anisotropy shapes the learning trajectory and final performance, providing a foundation for understanding analogous interactions in more advanced network architectures.
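The setting described in the abstract can be sketched numerically: sample inputs from a spiked covariance (isotropic bulk plus one dominant direction), define a linear teacher partially aligned with the spike, and train a two-layer linear network by gradient descent. This is a minimal illustrative sketch, not the paper's exact model or analysis; the dimension, spike strength, hidden width, learning rate, and alignment level are all assumed values chosen for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Spiked covariance: identity (isotropic bulk) plus one dominant direction.
d, n = 50, 2000            # input dimension, sample count (assumed values)
spike_strength = 10.0      # extra eigenvalue along the spike (assumed value)
u = np.zeros(d)
u[0] = 1.0                 # spike direction
cov = np.eye(d) + spike_strength * np.outer(u, u)
X = rng.multivariate_normal(np.zeros(d), cov, size=n)   # (n, d) inputs

# Linear teacher partially aligned with the spike direction.
beta = u + 0.3 * rng.standard_normal(d)
y = X @ beta

# Two-layer linear network f(x) = w2 . (W1 x), full-batch gradient descent
# on the squared error; small initialization makes the phases visible.
h = 20                     # hidden width (assumed value)
W1 = 0.01 * rng.standard_normal((h, d))
w2 = 0.01 * rng.standard_normal(h)
lr = 0.05 / n

losses = []
for _ in range(500):
    hidden = X @ W1.T              # (n, h) hidden activations
    err = hidden @ w2 - y          # residuals on the training set
    losses.append(float(err @ err) / n)
    grad_w2 = hidden.T @ err       # gradient wrt w2 (up to a constant factor)
    grad_W1 = np.outer(w2, X.T @ err)   # gradient wrt W1
    w2 -= lr * grad_w2
    W1 -= lr * grad_W1

print(f"loss: {losses[0]:.2f} -> {losses[-1]:.4f}")
```

Plotting `losses` on a log scale shows the behavior the abstract describes qualitatively: a rapid early drop as the network fits the spiked (input-output correlated) direction, followed by a slower decay along the remaining isotropic directions.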
Problem

Research questions and friction points this paper is trying to address.

anisotropic covariance
training dynamics
generalization error
linear networks
spiked covariance
Innovation

Methods, ideas, or system contributions that make the work stand out.

anisotropic covariance
learning dynamics
generalization error
spiked covariance model
linear neural networks
Taishi Watanabe
Graduate School of Informatics, Kyoto University
Ryo Karakida
AIST (National Institute of Advanced Industrial Science and Technology)
Machine Learning · Neural Networks · Deep Learning
Jun-nosuke Teramae
Graduate School of Informatics, Kyoto University