🤖 AI Summary
To address the challenge of simultaneously achieving high sparsity and high accuracy in convolutional neural network pruning, this paper proposes the Cyclic Overlapping Lottery Ticket (COLT) mechanism. COLT integrates data sharding, cyclic retraining from scratch, and intersection-based weight extraction across multiple subnetworks to produce highly sparse yet accurate subnetworks in a single pruning pipeline. It introduces, for the first time, a cyclic overlapping weight selection paradigm that relaxes the conventional Lottery Ticket Hypothesis (LTH) requirement for repeated iterative pruning and retraining, thereby significantly reducing computational overhead. Moreover, COLT exhibits strong cross-dataset transferability and generalization capability. On CIFAR-10, CIFAR-100, and TinyImageNet, COLT achieves higher sparsity than baseline methods while preserving original model accuracy, reduces the number of iterations by over 50% compared to Iterative Magnitude Pruning (IMP), and outperforms current state-of-the-art pruning approaches.
📝 Abstract
Pruning refers to the elimination of trivial weights from neural networks. The sub-networks produced by pruning an overparameterized model are often called lottery tickets. This research aims to generate, from a set of lottery tickets, a winning ticket that achieves accuracy similar to the original unpruned network. We introduce a novel winning ticket, the Cyclic Overlapping Lottery Ticket (COLT), generated by data splitting and cyclic retraining of the pruned network from scratch. Our cyclic pruning algorithm keeps only the overlapping weights of different pruned models trained on different data segments. Our results demonstrate that COLT achieves accuracies similar to the unpruned model while maintaining high sparsity. We show that the accuracy of COLT is on par with, and at times better than, the winning tickets of the Lottery Ticket Hypothesis (LTH). Moreover, COLTs can be generated in fewer iterations than tickets produced by the popular Iterative Magnitude Pruning (IMP) method. In addition, we notice that COLTs generated on large datasets can be transferred to smaller ones without compromising performance, demonstrating their generalization capability. We conduct all our experiments on the CIFAR-10, CIFAR-100, and TinyImageNet datasets and report performance superior to state-of-the-art methods.
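The overlap-based weight selection described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: training on each data shard is stubbed out with random weight matrices, per-shard pruning is approximated by simple magnitude pruning, and all names (`magnitude_mask`, `colt_mask`) are hypothetical.

```python
import numpy as np

def magnitude_mask(weights, sparsity):
    """Keep the largest-magnitude fraction (1 - sparsity) of weights."""
    k = int(weights.size * (1.0 - sparsity))
    threshold = np.sort(np.abs(weights), axis=None)[-k]
    return np.abs(weights) >= threshold

def colt_mask(weights_per_shard, sparsity):
    """Intersect the magnitude masks of models trained on different shards,
    keeping only weights that survive pruning in every shard's model."""
    masks = [magnitude_mask(w, sparsity) for w in weights_per_shard]
    overlap = masks[0]
    for m in masks[1:]:
        overlap &= m
    return overlap

rng = np.random.default_rng(0)
# Stand-ins for one layer's weights after training on two data shards.
w_a = rng.normal(size=(64, 64))
w_b = rng.normal(size=(64, 64))
mask = colt_mask([w_a, w_b], sparsity=0.8)
```

In the full pipeline, the intersected mask would then be applied to the network, which is retrained from scratch, and the prune-intersect-retrain cycle repeated; note that the intersection is always at least as sparse as each per-shard mask, which is how COLT drives sparsity up without extra pruning rounds.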