🤖 AI Summary
Existing sparse training methods (e.g., iterative pruning) either rely on pre-specified target sparsity levels or incur high retraining costs. This paper proposes a novel “sparse-to-dense growth” paradigm that abandons pruning entirely: edges are grown incrementally, guided by path-weight products; structural bottlenecks are mitigated via randomization; and convergence is determined autonomously once accuracy plateaus, enabling end-to-end discovery of the network's operating density. Its core innovation is the first growth-based sparsity exploration mechanism, which eliminates the need for a predefined density and enables dynamic trade-offs between accuracy and computational cost. Evaluated on CIFAR-10/100, TinyImageNet, and ImageNet, the method approaches the performance of Iterative Magnitude Pruning (IMP) lottery tickets, though at somewhat higher density, while requiring only ~1.5x the cost of dense training, roughly one-third to one-half that of IMP.
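The plateau-based stopping idea could look roughly like the sketch below: fit a logistic curve to the accuracy-per-growth-round history and stop once the fitted curve predicts negligible further gain. This is a hypothetical illustration, not the paper's actual criterion; the grid-search fit, the 5-round horizon, and the tolerance are all assumed values.

```python
import math

def fit_logistic(acc):
    """Coarse grid-search least-squares fit of A / (1 + exp(-k (t - t0)))
    to an accuracy history `acc` (one value per growth round).
    The grids for A, k, and t0 are illustrative assumptions."""
    ts = list(range(len(acc)))
    best, best_err = None, float("inf")
    for A in (max(acc) * s for s in (1.0, 1.05, 1.1, 1.2, 1.4)):
        for k in (0.1, 0.3, 0.5, 1.0, 2.0):
            for t0 in ts:
                err = sum((A / (1 + math.exp(-k * (t - t0))) - a) ** 2
                          for t, a in zip(ts, acc))
                if err < best_err:
                    best, best_err = (A, k, t0), err
    return best

def should_stop(acc, horizon=5, tol=0.01, min_rounds=5):
    """Stop growing when the fitted curve predicts less than `tol`
    additional accuracy over the next `horizon` growth rounds."""
    if len(acc) < min_rounds:
        return False
    A, k, t0 = fit_logistic(acc)
    pred = lambda t: A / (1 + math.exp(-k * (t - t0)))
    t = len(acc) - 1
    return pred(t + horizon) - pred(t) < tol
```

A plateauing history such as `[0.2, 0.5, 0.7, 0.78, 0.8, 0.805, 0.806]` triggers the stop, while a still-rising one such as `[0.2, 0.3, 0.4, 0.5, 0.6]` does not.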
📝 Abstract
The lottery ticket hypothesis suggests that dense networks contain sparse subnetworks that can be trained in isolation to match full-model performance. Existing approaches (iterative pruning, dynamic sparse training, and pruning at initialization) either incur heavy retraining costs or assume the target density is fixed in advance. We introduce Path Weight Magnitude Product-biased Random growth (PWMPR), a constructive sparse-to-dense training paradigm that grows networks rather than pruning them, while automatically discovering their operating density. Starting from a sparse seed, PWMPR adds edges guided by path-kernel-inspired scores, mitigates bottlenecks via randomization, and stops when a logistic-fit rule detects plateauing accuracy. Experiments on CIFAR, TinyImageNet, and ImageNet show that PWMPR approaches the performance of IMP-derived lottery tickets, though at higher density, at substantially lower cost (~1.5x dense vs. 3-4x for IMP). These results establish growth-based density discovery as a promising paradigm that complements pruning and dynamic sparsity.
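As a rough illustration of the growth step described above, the sketch below scores a candidate edge by the product of absolute-weight path sums flowing into and out of it (one plausible reading of a path-kernel-inspired score), then grows edges by score-biased random sampling with an occasional uniform pick to mitigate bottlenecks. The function names, the forward/backward factorization, and the explore rate are assumptions for illustration, not the paper's implementation.

```python
import random

def abs_matvec(W, v):
    """y = |W| v for a dense matrix stored as rows (out x in)."""
    return [sum(abs(w) * x for w, x in zip(row, v)) for row in W]

def abs_matvec_T(W, v):
    """y = |W|^T v."""
    return [sum(abs(W[j][i]) * v[j] for j in range(len(W)))
            for i in range(len(W[0]))]

def edge_scores(weights, layer):
    """Score each edge slot (j, i) in `layer` by the total absolute path-weight
    product through it: fwd[i] * bwd[j]. `weights[l][j][i]` is the weight from
    unit i of layer l to unit j of layer l+1; zero means the edge is absent."""
    # Forward: total |weight| path product from the inputs into each unit.
    fwd = [1.0] * len(weights[0][0])
    for l in range(layer):
        fwd = abs_matvec(weights[l], fwd)
    # Backward: total |weight| path product from each unit to the outputs.
    bwd = [1.0] * len(weights[-1])
    for l in range(len(weights) - 1, layer, -1):
        bwd = abs_matvec_T(weights[l], bwd)
    return [[bwd[j] * fwd[i] for i in range(len(weights[layer][0]))]
            for j in range(len(weights[layer]))]

def grow_edges(weights, layer, k, explore=0.2, rng=random):
    """Grow up to k absent edges in `layer`: with probability `explore` pick
    uniformly at random (mitigates bottlenecks), otherwise sample with
    probability proportional to the path-weight-product score."""
    scores = edge_scores(weights, layer)
    absent = [(j, i) for j in range(len(scores))
              for i in range(len(scores[0])) if weights[layer][j][i] == 0.0]
    grown = []
    for _ in range(min(k, len(absent))):
        if rng.random() < explore:
            idx = rng.randrange(len(absent))
        else:
            # Small epsilon keeps sampling valid if all scores are zero.
            w = [scores[j][i] + 1e-12 for j, i in absent]
            idx = rng.choices(range(len(absent)), weights=w, k=1)[0]
        j, i = absent.pop(idx)
        weights[layer][j][i] = rng.gauss(0.0, 0.01)  # new edge, small init
        grown.append((j, i))
    return grown
```

In a toy two-layer network, an absent edge feeding a downstream unit with large outgoing weights scores higher and is grown first, which matches the intuition that growth should reinforce high-magnitude paths.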