🤖 AI Summary
To address the high computational cost of CNN inference on resource-constrained devices—without relying on additional data or fine-tuning—this paper proposes HASTE, a data-free, training-free, plug-and-play dynamic convolution module. Its core innovation is to use Locality-Sensitive Hashing (LSH) to model channel-wise feature redundancy at runtime, detecting redundant input channels on the fly and compressing them together with the corresponding filter depths via dynamic structured pruning. According to the authors, HASTE is the first fully data-free and training-free method for dynamic structured pruning. Evaluated on CIFAR-10, it reduces the FLOPs of a ResNet-34 by 46.72% with only a 1.25% accuracy drop. Results on ImageNet confirm that the approach generalizes across datasets, supporting more efficient deployment on edge devices.
📝 Abstract
To reduce the computational cost of convolutional neural networks (CNNs) on resource-constrained devices, structured pruning approaches have shown promise in lowering floating-point operations (FLOPs) without substantial drops in accuracy. However, most methods require fine-tuning or specific training procedures to achieve a reasonable trade-off between retained accuracy and reduction in FLOPs, adding computational overhead and requiring training data to be available. To address this, we propose HASTE (Hashing for Tractable Efficiency), a data-free, plug-and-play convolution module that instantly reduces a network's test-time inference cost without training or fine-tuning. Our approach utilizes locality-sensitive hashing (LSH) to detect redundancies in the channel dimension of latent feature maps, compressing similar channels to reduce input and filter depth simultaneously, resulting in cheaper convolutions. We demonstrate our approach on the popular vision benchmarks CIFAR-10 and ImageNet, where we achieve a 46.72% reduction in FLOPs with only a 1.25% loss in accuracy by swapping the convolution modules in a ResNet-34 on CIFAR-10 for our HASTE module.
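To make the idea concrete, here is a minimal NumPy sketch of LSH-based channel merging, not the paper's actual implementation. It uses random-hyperplane LSH (an assumed choice of hash family) to bucket similar input channels, replaces each bucket by its mean channel, and sums the matching filter slices so that the convolution over the compressed input approximates the original. The function name `lsh_merge_channels` and all parameters are hypothetical.

```python
import numpy as np

def lsh_merge_channels(x, w, n_hyperplanes=8, seed=0):
    """Hypothetical sketch of LSH-based channel compression.

    x: input feature map, shape (C_in, H, W)
    w: conv weight, shape (C_out, C_in, k, k)
    Returns (x', w') with fewer channels; convolving w' with x'
    approximates convolving w with x when bucketed channels are similar.
    """
    c_in = x.shape[0]
    rng = np.random.default_rng(seed)
    # Random hyperplanes; the sign pattern of the projections is the hash code.
    planes = rng.standard_normal((n_hyperplanes, x[0].size))
    flat = x.reshape(c_in, -1)
    codes = (flat @ planes.T > 0).astype(np.uint8)

    # Channels with identical hash codes land in the same bucket.
    buckets = {}
    for i, code in enumerate(map(tuple, codes)):
        buckets.setdefault(code, []).append(i)

    x_new, w_new = [], []
    for idx in buckets.values():
        # Represent each bucket by the mean of its channels ...
        x_new.append(flat[idx].mean(axis=0).reshape(x.shape[1:]))
        # ... and sum the matching filter slices to preserve the output.
        w_new.append(w[:, idx].sum(axis=1))
    return np.stack(x_new), np.stack(w_new, axis=1)
```

When two channels are exact duplicates they hash to the same bucket, and the merged convolution is exactly equivalent; for merely similar channels the result is an approximation whose error shrinks as bucketed channels become more alike.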