Flexible Automatic Identification and Removal (FAIR)-Pruner: An Efficient Neural Network Pruning Method

📅 2025-08-04
📈 Citations: 0
🤖 AI Summary
To address the challenge of deploying large-scale neural networks on resource-constrained edge devices, this paper proposes FAIR-Pruner—a one-shot, structured pruning method that requires no post-pruning fine-tuning. Methodologically, it jointly leverages a Wasserstein-distance-driven unit utilization score and a Taylor-expansion-based reconstruction error to quantify both a unit's importance and the accuracy impact of removing it. It further introduces a Tolerance of Difference mechanism that automatically determines layer-wise pruning ratios, enabling flexible, configuration-free sparsity allocation. Evaluated on benchmarks including ImageNet with architectures such as VGG, FAIR-Pruner achieves significant reductions in model size and computational cost (FLOPs) while incurring negligible accuracy degradation. Crucially, it eliminates the need for any fine-tuning—enhancing deployment efficiency, automation, and practicality for edge scenarios.

📝 Abstract
Neural network pruning is a critical compression technique that facilitates the deployment of large-scale neural networks on resource-constrained edge devices, typically by identifying and eliminating redundant or insignificant parameters to reduce computational and memory overhead. This paper proposes the Flexible Automatic Identification and Removal (FAIR)-Pruner, a novel method for neural network structured pruning. Specifically, FAIR-Pruner first evaluates the importance of each unit (e.g., neuron or channel) through the Utilization Score, quantified by the Wasserstein distance. To reflect the performance degradation after unit removal, it then introduces the Reconstruction Error, which is computed via the Taylor expansion of the loss function. Finally, FAIR-Pruner identifies superfluous units with negligible impact on model performance by controlling the proposed Tolerance of Difference, which measures the difference between units that are unimportant and units whose removal causes performance degradation. A major advantage of FAIR-Pruner lies in its capacity to automatically determine layer-wise pruning rates, which yields a more efficient subnetwork structure than applying a uniform pruning rate. Another advantage is its strong one-shot performance without post-pruning fine-tuning. Furthermore, with utilization scores and reconstruction errors, users can flexibly obtain pruned models under different pruning ratios. Comprehensive experimental validation on diverse benchmark datasets (e.g., ImageNet) and various neural network architectures (e.g., VGG) demonstrates that FAIR-Pruner achieves significant model compression while maintaining high accuracy.
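The three ingredients the abstract describes can be sketched in a few lines. This is an illustrative reading, not the paper's formulas: the choice of baseline distribution, the per-unit 1-D Wasserstein distance, the first-order Taylor term, and the linear tolerance-scaled threshold below are all assumptions made for the sketch.

```python
import numpy as np

def utilization_score(activations, baseline):
    """Sketch of the Utilization Score: 1-D Wasserstein-1 distance between
    a unit's activation samples and a reference (baseline) distribution.
    For equal-size samples this equals the mean absolute difference of the
    sorted samples. A unit close to the baseline (low score) is a pruning
    candidate under this reading."""
    a = np.sort(np.asarray(activations, dtype=float))
    b = np.sort(np.asarray(baseline, dtype=float))
    return float(np.mean(np.abs(a - b)))

def reconstruction_error(grad, activation):
    """Sketch of the Reconstruction Error: first-order Taylor estimate of
    the loss change when the unit's output is zeroed, |dL/da * a|,
    averaged over samples."""
    return float(np.mean(np.abs(np.asarray(grad) * np.asarray(activation))))

def prune_mask(scores, errors, tolerance):
    """Sketch of the Tolerance of Difference: mark a unit prunable when
    both its utilization score and its reconstruction error fall below a
    tolerance-scaled threshold between the layer's min and max values.
    The linear thresholding rule here is an assumption."""
    s, e = np.asarray(scores, dtype=float), np.asarray(errors, dtype=float)
    s_thr = s.min() + tolerance * (s.max() - s.min())
    e_thr = e.min() + tolerance * (e.max() - e.min())
    return (s <= s_thr) & (e <= e_thr)
```

Under this sketch, raising `tolerance` prunes more aggressively, and because each layer's thresholds are set from that layer's own score range, the effective pruning rate varies per layer rather than being uniform—mirroring the automatic layer-wise allocation the paper claims.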
Problem

Research questions and friction points this paper is trying to address.

Efficiently prunes neural networks for edge devices
Automatically determines layer-wise pruning rates
Maintains high accuracy without fine-tuning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Wasserstein distance for unit importance evaluation
Employs Taylor expansion to compute reconstruction error
Automatically determines layer-wise pruning rates
Chenqing Lin
School of Statistics and Mathematics, Zhejiang Gongshang University, China
Mostafa Hussien
École de Technologie Supérieure (ÉTS), University of Quebec, Canada
Chengyao Yu
Southern University of Science & Technology
Statistics, Uncertainty Quantification
Mohamed Cheriet
Full Professor, ÉTS (U. of Quebec), Former Canada Research Chair, SYNCHROMEDIA Lab, CIRODD
Pattern Recognition, Machine Learning, Image Processing, Sustainable Intell Cloud & Networks, Energy
Osama Abdelrahman
Department of Statistics, Mathematics, and Insurance, Faculty of Commerce, Assiut University, Asyut, Egypt
Ruixing Ming
School of Statistics and Mathematics, Zhejiang Gongshang University, China