🤖 AI Summary
Conventional pruning methods suffer severe accuracy collapse at high sparsity levels, preventing models from meeting stringent hardware constraints on size. To address this, we propose a bidirectional pruning-regrowth framework that departs from traditional unidirectional pruning: it first applies aggressive structured pruning, then dynamically restores critical connections based on importance estimation and performance feedback. This iterative co-optimization of pruning and selective connection regrowth effectively mitigates accuracy degradation under extreme compression. Experiments show that our method achieves an average accuracy improvement of 4.2% over state-of-the-art approaches at equivalent sparsity levels. Notably, on ResNet-50 it attains 95% sparsity while retaining over 98% of the original accuracy, substantially outperforming existing pruning techniques. The proposed framework thus offers a practical path to deploying accurate, ultra-sparse models on resource-constrained edge devices.
📝 Abstract
As a widely adopted model compression technique, model pruning has demonstrated strong effectiveness across various architectures. However, we observe that once sparsity exceeds a certain threshold, both iterative and one-shot pruning methods cause a steep decline in model performance. This rapid degradation limits the achievable compression ratio and prevents models from meeting the stringent size constraints imposed by certain hardware platforms, making the pruned models impossible to deploy there. To overcome this limitation, we propose a bidirectional pruning-regrowth strategy: starting from an extremely compressed network that satisfies the hardware constraints, the method selectively regrows critical connections to recover lost performance, effectively mitigating the sharp accuracy drop commonly observed at high sparsity.
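The prune-then-regrow loop described above can be sketched in miniature. The snippet below is an illustrative toy, not the paper's implementation: it prunes a flat weight vector to a target sparsity by magnitude, then regrows the pruned connections with the largest gradient magnitude, a common importance proxy assumed here for demonstration (the paper's actual importance estimation and performance feedback are not specified in the abstract).

```python
import random

def prune_by_magnitude(weights, sparsity):
    """Keep only the largest-magnitude weights; zero the rest.

    Returns (pruned_weights, mask), where mask[i] is True for
    connections that survive pruning.
    """
    n_keep = len(weights) - int(sparsity * len(weights))
    keep_idx = set(sorted(range(len(weights)),
                          key=lambda i: abs(weights[i]),
                          reverse=True)[:n_keep])
    mask = [i in keep_idx for i in range(len(weights))]
    return [w if m else 0.0 for w, m in zip(weights, mask)], mask

def regrow_by_gradient(mask, grads, n_regrow):
    """Re-enable the n_regrow pruned connections with the largest
    gradient magnitude -- a hypothetical importance criterion used
    here only to illustrate the regrowth step.
    """
    pruned = [i for i, m in enumerate(mask) if not m]
    revive = sorted(pruned, key=lambda i: abs(grads[i]),
                    reverse=True)[:n_regrow]
    new_mask = list(mask)
    for i in revive:
        new_mask[i] = True
    return new_mask

# Toy usage: prune to 95% sparsity, then regrow a few connections.
random.seed(0)
weights = [random.gauss(0, 1) for _ in range(100)]
pruned_w, mask = prune_by_magnitude(weights, 0.95)   # 5 weights survive
grads = [random.gauss(0, 1) for _ in range(100)]     # stand-in gradients
mask = regrow_by_gradient(mask, grads, 3)            # 8 connections active
```

In a full pipeline this prune/regrow cycle would repeat, with fine-tuning between iterations supplying the performance feedback that guides which connections to restore.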