Why Loss Re-weighting Works If You Stop Early: Training Dynamics of Unconstrained Features

πŸ“… 2026-01-17
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
This work investigates the role of loss reweighting in overparameterized deep neural networks under class imbalance: reweighting does not change the final converged state, yet it markedly improves minority-class learning early in training, a mechanism that was previously unexplained. To address this, the authors propose an interpretable small-scale model (SSM) that combines spectral analysis with a model of the training dynamics, revealing how class imbalance is encoded in the spectral structure of the features. The study provides the first transparent account of how loss reweighting lets features for majority and minority classes be learned simultaneously during the early training phase, shifting the focus away from conventional convergence-centric analyses. Experiments confirm that standard training inherently favors majority classes, whereas reweighting balances the early learning dynamics, providing a theoretical foundation for the performance gains observed when reweighting is combined with early stopping.
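
The paper's own code is not included here; the sketch below is a minimal illustration of the two ingredients the summary refers to, assuming inverse-frequency class weights for the reweighted cross-entropy and a simple patience rule on balanced validation accuracy for early stopping. All function names and the demo data are illustrative, not taken from the paper.

```python
import numpy as np

def inverse_frequency_weights(labels, n_classes):
    """Per-class weights proportional to 1 / class frequency (mean weight ~ 1)."""
    counts = np.bincount(labels, minlength=n_classes).astype(float)
    return counts.sum() / (n_classes * np.maximum(counts, 1.0))

def reweighted_cross_entropy(logits, labels, class_weights):
    """Softmax cross-entropy with each example scaled by its class weight."""
    z = logits - logits.max(axis=1, keepdims=True)              # numerical stability
    log_p = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    nll = -log_p[np.arange(len(labels)), labels]
    return float(np.mean(class_weights[labels] * nll))

def should_stop(balanced_acc_history, patience=5):
    """Early stopping: halt once balanced validation accuracy stops improving."""
    best = int(np.argmax(balanced_acc_history))
    return len(balanced_acc_history) - 1 - best >= patience

# Tiny demo on random data (illustrative only).
rng = np.random.default_rng(0)
labels = rng.choice(3, size=32, p=[0.7, 0.2, 0.1])              # imbalanced batch
logits = rng.normal(size=(32, 3))
w = inverse_frequency_weights(labels, n_classes=3)
print("class weights:", np.round(w, 2))
print("reweighted loss:", round(reweighted_cross_entropy(logits, labels, w), 3))
```

Monitoring a balanced metric (rather than plain accuracy) for the stopping rule matters under imbalance, since overall accuracy can keep rising while minority classes are still poorly fit.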

πŸ“ Abstract
The application of loss reweighting in modern deep learning presents a nuanced picture. While it fails to alter the terminal learning phase in overparameterized deep neural networks (DNNs) trained on high-dimensional datasets, empirical evidence consistently shows it offers significant benefits early in training. To transparently demonstrate and analyze this phenomenon, we introduce a small-scale model (SSM). This model is specifically designed to abstract the inherent complexities of both the DNN architecture and the input data, while maintaining key information about the structure of imbalance within its spectral components. On the one hand, the SSM reveals how vanilla empirical risk minimization preferentially learns to distinguish majority classes over minorities early in training, consequently delaying minority learning. In stark contrast, reweighting restores balanced learning dynamics, enabling the simultaneous learning of features associated with both majorities and minorities.
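
To make the claimed dynamics concrete, here is a minimal toy experiment; it is not the paper's SSM, whose exact construction is not given in this abstract, but a generic stand-in under stated assumptions. A logistic-regression model is trained by gradient descent on a synthetic 90/10 imbalanced two-cluster mixture, and the per-class training loss is tracked with and without inverse-frequency reweighting. Under plain ERM the majority loss should fall first; with reweighting the two curves should fall together.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic imbalanced binary problem: 90% majority (y=0), 10% minority (y=1).
n_maj, n_min, d = 900, 100, 20
X = np.vstack([rng.normal(+1.0, 1.0, (n_maj, d)),      # majority cluster
               rng.normal(-1.0, 1.0, (n_min, d))])     # minority cluster
y = np.concatenate([np.zeros(n_maj), np.ones(n_min)]).astype(int)

def loss_and_grad(w, b, reweight):
    """Per-class logistic losses and (optionally reweighted) gradients."""
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    nll = -(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    sw = np.ones_like(nll)
    if reweight:                                        # weight proportional to 1 / class frequency
        sw = np.where(y == 1, len(y) / (2 * n_min), len(y) / (2 * n_maj))
    grad_w = X.T @ (sw * (p - y)) / len(y)
    grad_b = np.mean(sw * (p - y))
    return nll[y == 0].mean(), nll[y == 1].mean(), grad_w, grad_b

for reweight in (False, True):
    w, b = np.zeros(d), 0.0
    for step in range(201):
        maj_loss, min_loss, gw, gb = loss_and_grad(w, b, reweight)
        if step % 100 == 0:
            print(f"reweight={reweight} step={step:3d} "
                  f"majority loss={maj_loss:.3f} minority loss={min_loss:.3f}")
        w -= 0.5 * gw
        b -= 0.5 * gb
```

The 90/10 split, cluster means, and learning rate are arbitrary illustrative choices; the point is the qualitative gap between the early per-class losses with and without reweighting, mirroring the imbalance in early learning dynamics described above.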
Problem

Research questions and friction points this paper is trying to address.

loss reweighting
training dynamics
class imbalance
early stopping
deep neural networks
Innovation

Methods, ideas, or system contributions that make the work stand out.

loss reweighting
training dynamics
class imbalance
small-scale model
feature learning