Recovering Plasticity of Neural Networks via Soft Weight Rescaling

📅 2025-07-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Unbounded weight growth during neural network training leads to plasticity loss, degrading generalization and optimization stability. To address this, we propose Soft Weight Rescaling (SWR), a lightweight technique that dynamically rescales layer-wise weights at each gradient descent step—constraining their magnitudes and equalizing inter-layer distributions—without requiring weight reinitialization. Theoretically, SWR mitigates gradient degradation and preserves the efficacy of parameter updates. Empirically, it consistently improves performance across warm-start image classification, continual learning, and single-task settings; in continual learning, it boosts average accuracy by up to 3.2% while retaining previously acquired knowledge throughout training. This work is the first to rigorously demonstrate—both theoretically and empirically—that minimal, adaptive weight rescaling suffices to sustain plasticity, obviating the need for costly reinitialization strategies.

📝 Abstract
Recent studies have shown that as training progresses, neural networks gradually lose their capacity to learn new information, a phenomenon known as plasticity loss. Unbounded weight growth is one of the main causes of plasticity loss; it also harms generalization and disrupts optimization dynamics. Re-initializing the network can be a solution, but it discards learned information, leading to performance drops. In this paper, we propose Soft Weight Rescaling (SWR), a novel approach that prevents unbounded weight growth without losing information. SWR recovers the plasticity of the network by simply scaling down the weights at each step of the learning process. We theoretically prove that SWR bounds weight magnitudes and balances them across layers. Our experiments show that SWR improves performance in warm-start learning, continual learning, and single-task learning setups on standard image classification benchmarks.
Problem

Research questions and friction points this paper is trying to address.

Preventing unbounded weight growth in neural networks
Recovering network plasticity without losing learned information
Improving performance in various learning setups
Innovation

Methods, ideas, or system contributions that make the work stand out.

Soft Weight Rescaling prevents unbounded weight growth
SWR recovers plasticity by scaling down weights at each training step
Theoretically shown to bound and balance weight magnitudes across layers
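The core idea, per-layer weight rescaling applied after each gradient step, can be sketched as follows. This is a minimal illustration, not the paper's exact rule: the function name, the interpolation factor `ratio`, and the use of each layer's initial norm as the target bound are assumptions for the sake of the example.

```python
import numpy as np

def soft_weight_rescale(weights, init_norms, ratio=0.99):
    """Hypothetical sketch of soft rescaling: after each gradient step,
    softly shrink any layer whose weight norm exceeds its initial norm,
    bounding weight growth without re-initializing the layer."""
    rescaled = []
    for W, n0 in zip(weights, init_norms):
        norm = np.linalg.norm(W)
        if norm > n0:
            # Interpolate the scale toward the target norm rather than
            # projecting onto it exactly ("soft" rather than hard clipping).
            scale = ratio + (1.0 - ratio) * (n0 / norm)
            W = W * scale
        rescaled.append(W)
    return rescaled
```

Because the shrinkage is multiplicative and gradual, the learned weight directions are preserved; only the magnitudes are pulled back, which is how the method avoids the information loss of re-initialization.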