Data-Free Dynamic Compression of CNNs for Tractable Efficiency

📅 2023-09-29
🏛️ Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the high computational cost of CNN inference on resource-constrained devices—without relying on additional data or fine-tuning—this paper proposes HASTE, a data-free, training-free, plug-and-play dynamic convolution module. Its core innovation is using Locality-Sensitive Hashing (LSH) to model channel-wise feature redundancy at runtime, detecting redundant input channels on the fly and compressing them together with the corresponding filter depths, which yields cheaper convolutions per input. The authors present HASTE as the first fully data- and training-free method for dynamic structured compression. Evaluated on CIFAR-10, it reduces the FLOPs of ResNet-34 by 46.72% with only a 1.25% accuracy drop; generalization is further demonstrated on ImageNet, indicating practical gains for edge deployment.
📝 Abstract
To reduce the computational cost of convolutional neural networks (CNNs) on resource-constrained devices, structured pruning approaches have shown promise in lowering floating-point operations (FLOPs) without substantial drops in accuracy. However, most methods require fine-tuning or specific training procedures to achieve a reasonable trade-off between retained accuracy and reduction in FLOPs, adding computational overhead and requiring training data to be available. To this end, we propose HASTE (Hashing for Tractable Efficiency), a data-free, plug-and-play convolution module that instantly reduces a network's test-time inference cost without training or fine-tuning. Our approach utilizes locality-sensitive hashing (LSH) to detect redundancies in the channel dimension of latent feature maps, compressing similar channels to reduce input and filter depth simultaneously, resulting in cheaper convolutions. We demonstrate our approach on the popular vision benchmarks CIFAR-10 and ImageNet, where we achieve a 46.72% reduction in FLOPs with only a 1.25% loss in accuracy by swapping the convolution modules in a ResNet34 on CIFAR-10 for our HASTE module.
Problem

Research questions and friction points this paper is trying to address.

CNN inference is computationally expensive on resource-constrained devices
Existing structured pruning methods need fine-tuning or specific training procedures, and therefore access to training data
Channel redundancy in latent feature maps must be detected at test time, without any training signal
Innovation

Methods, ideas, or system contributions that make the work stand out.

Data-free dynamic compression using HASTE module
Locality-sensitive hashing reduces channel redundancies
Plug-and-play convolution without training or fine-tuning
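The abstract describes the core mechanism: hash each input channel with LSH, treat channels that land in the same bucket as redundant, and merge them so both the input depth and the filter depth shrink before the convolution runs. Because convolution is linear in its input channels, summing the filter slices of a group while averaging that group's channels approximately preserves the output. The sketch below illustrates this idea with random-hyperplane LSH in NumPy; `haste_like_conv`, the hash setup, and all parameter choices are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def haste_like_conv(x, w, n_bits=4, seed=0):
    """Illustrative sketch of LSH-based dynamic channel compression.

    x: input feature map, shape (C, H, W)
    w: conv filters, shape (F, C, k, k)
    Channels whose random-hyperplane hash codes collide are treated as
    redundant: their activations are averaged and the matching filter
    slices are summed, so convolving the compressed pair approximates
    the original convolution at lower cost (convolution is linear in
    its input channels).
    """
    C, H, W = x.shape
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((n_bits, H * W))  # random LSH hyperplanes
    flat = x.reshape(C, -1)
    # Per-channel n_bit hash code from the signs of the projections.
    codes = (flat @ planes.T > 0).astype(int) @ (1 << np.arange(n_bits))

    groups = {}
    for c, code in enumerate(codes):
        groups.setdefault(int(code), []).append(c)

    # Compressed input: mean of each group's channels.
    x_c = np.stack([flat[g].mean(axis=0) for g in groups.values()])
    x_c = x_c.reshape(-1, H, W)
    # Compressed filters: sum of each group's filter slices.
    w_c = np.stack([w[:, g].sum(axis=1) for g in groups.values()], axis=1)
    return x_c, w_c  # convolve w_c over x_c instead of w over x
```

Exactly duplicated channels always share a hash code, so they are guaranteed to be merged; near-duplicates collide with high probability, which is what makes the compression data-free and input-dependent.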
Lukas Meiner
Robert Bosch GmbH & University of Lübeck
Deep Learning · Efficient Neural Networks · Model Compression
Jens Mehnert
Robert Bosch GmbH
Stochastics · Efficient AI · Embedded AI · Model Compression · Scaling AI
A. Condurache
Automated Driving Research, Robert Bosch GmbH, 70469 Stuttgart, Germany; Institute for Signal Processing, University of Lübeck, 23562 Lübeck, Germany