On the Stability of Growth in Structural Plasticity

📅 2026-05-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a failure mode of network growth in structural plasticity: newly inserted neurons often receive weak gradient signals and therefore integrate poorly into the trained network, an effect that is most pronounced in harder image-classification tasks. The study analyzes the asymmetry between growth and pruning, arguing that growth should be treated as a time-sensitive optimization process and that insertion stability is critical to its success. Using a structural plasticity setup with a convolutional trunk, the authors evaluate Grow and Prune strategies on image-classification and continual-learning benchmarks and introduce targeted interventions on optimizer state, insertion, selection, and trainability. Experiments show that Grow can reach high final accuracy during the structural-editing procedure, while Prune is stronger when performance is averaged over the training trajectory or when the final sparse network is retrained from scratch. In continual learning, Grow becomes competitive mainly when newly added units have enough time to integrate.

📝 Abstract

Standard deep-learning pipelines usually choose the network architecture before training and keep it fixed throughout optimization. In contrast, a model can also be adapted by editing its structure during training, for example by pruning existing hidden-neuron units or growing new ones. Although growth is appealing for adaptive and continual systems, we show that it is not simply the inverse of pruning. Pruning selects among units that have participated in training from the start, whereas growth inserts new units into an already specialized optimization trajectory. We isolate this insertion problem and show that newborn units are often forward-active but backward-starved: they participate in the forward computation, yet receive much weaker gradient signal than incumbent units. This disadvantage is minor in small MLP benchmarks, but becomes clear in harder image-classification settings with a convolutional trunk. In these settings, Grow can achieve high final accuracy during the structural-editing procedure, while Prune is stronger when performance is averaged over the training trajectory or when the final sparse network is retrained from scratch. Interventions targeting optimizer state, insertion, selection, and trainability show that improving the integration of newborn units can improve adaptive performance, but does not automatically produce better final subnetworks. In continual-learning benchmarks stressing plasticity loss, Grow becomes competitive mainly when new units have enough time to integrate. Together, these results suggest that Grow should be evaluated not only as an architecture-search operator, but as a time-sensitive optimization process whose success depends on insertion stability.
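
The "forward-active but backward-starved" effect can be illustrated with a toy calculation. The sketch below is not the paper's experimental setup: it assumes a one-hidden-layer tanh MLP with a scalar linear output and squared-error loss, and it assumes the newborn unit is inserted with a small outgoing weight (a common way to avoid disturbing the network's output, used here only for illustration). The function name `hidden_grad_norms` and all numeric values are hypothetical. Under these assumptions, backpropagation scales the gradient reaching a unit's incoming weights by that unit's outgoing weight, so the newborn unit has a clearly nonzero activation but a much smaller gradient than the incumbents.

```python
import math


def hidden_grad_norms(x, target, w_in, w_out):
    """Return (activations, per-unit gradient norms of the incoming weights).

    Toy model (an assumption, not the paper's architecture):
        h_j = tanh(w_in[j] . x),  y = sum_j w_out[j] * h_j,
        loss = 0.5 * (y - target) ** 2
    """
    # Forward pass through the hidden layer and the linear output.
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_in]
    y = sum(wo * hj for wo, hj in zip(w_out, h))

    err = y - target  # dLoss/dy for the squared-error loss
    x_norm = math.sqrt(sum(xi * xi for xi in x))

    norms = []
    for hj, wo in zip(h, w_out):
        # dLoss/dz_j = err * w_out[j] * tanh'(z_j), with tanh'(z) = 1 - tanh(z)^2
        dz = err * wo * (1.0 - hj * hj)
        # dLoss/dw_in[j] = dz * x, so its Euclidean norm is |dz| * ||x||
        norms.append(abs(dz) * x_norm)
    return h, norms


# Hypothetical values: three incumbent units followed by one newborn unit.
x = [1.0, -0.5, 0.25, 0.75]
w_in = [
    [0.3, -0.2, 0.1, 0.4],   # incumbent
    [-0.4, 0.1, 0.3, -0.2],  # incumbent
    [0.2, 0.5, -0.3, 0.1],   # incumbent
    [0.1, -0.3, 0.2, 0.3],   # newborn: ordinary incoming weights
]
w_out = [0.8, -1.1, 0.6, 1e-3]  # newborn gets a small outgoing weight (assumed)

h, norms = hidden_grad_norms(x, target=0.0, w_in=w_in, w_out=w_out)

for name, hj, g in zip(["inc0", "inc1", "inc2", "newborn"], h, norms):
    print(f"{name:8s} activation={hj:+.4f} grad_norm={g:.6f}")
```

In this toy case the gap comes entirely from the outgoing-weight factor in the chain rule. The abstract reports the gradient disadvantage empirically and does not attribute it to a single cause, so the sketch should be read as one possible mechanism rather than the paper's explanation.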

Problem

Research questions and friction points this paper is trying to address.

structural plasticity

network growth

gradient starvation

insertion stability

continual learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

structural plasticity

network growth

gradient starvation