Forget the Data and Fine-Tuning! Just Fold the Network to Compress

📅 2025-02-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses data-free, fine-tuning-free model compression. It proposes *model folding*, a compression technique that identifies structurally similar neurons across layers and merges them with k-means clustering, while using data-free techniques to prevent the variance of activations from collapsing or exploding. The method is supported by a theoretical framework aimed at preserving data statistics after compression. Experiments on ResNet-18 and LLaMA-7B show that it performs comparably to data-driven compression methods and outperforms recent data-free alternatives, especially at high sparsity levels. It is intended for compressing large-scale models for deployment in resource-constrained environments.

📝 Abstract

We introduce model folding, a novel data-free model compression technique that merges structurally similar neurons across layers, significantly reducing the model size without the need for fine-tuning or access to training data. Unlike existing methods, model folding preserves data statistics during compression by leveraging k-means clustering and by using novel data-free techniques to prevent variance collapse or explosion. Our theoretical framework and experiments across standard benchmarks, including ResNet18 and LLaMA-7B, demonstrate that model folding achieves comparable performance to data-driven compression techniques and outperforms recently proposed data-free methods, especially at high sparsity levels. This approach is particularly effective for compressing large-scale models, making it suitable for deployment in resource-constrained environments.
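
To make the idea of merging similar neurons with k-means concrete, here is a minimal sketch, not the authors' implementation. It assumes a toy two-layer ReLU network `y = W2 · relu(W1 · x + b1)`; the names `kmeans`, `fold_layer`, and `forward` are hypothetical helpers introduced only for illustration. Hidden neurons are clustered by their incoming weights and bias, each cluster is replaced by its centroid, and the matching outgoing weights are summed. The variance-repair step mentioned in the abstract is not modeled here.

```python
def kmeans(rows, k, iters=20):
    """Plain Lloyd's k-means on a list of equal-length float lists."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Deterministic farthest-point initialisation (an assumption of this sketch).
    centroids = [list(rows[0])]
    while len(centroids) < k:
        far = max(rows, key=lambda r: min(dist2(r, c) for c in centroids))
        centroids.append(list(far))

    labels = [0] * len(rows)
    for _ in range(iters):
        # Assign every row to its nearest centroid.
        labels = [min(range(k), key=lambda j: dist2(r, centroids[j])) for r in rows]
        # Move each centroid to the mean of its members (empty clusters stay put).
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def fold_layer(W1, b1, W2, k):
    """Fold the hidden layer down to k neurons without using any data.

    W1: hidden x in, b1: hidden, W2: out x hidden (nested lists of floats).
    """
    # A neuron is described by its incoming weights plus its bias.
    rows = [list(w) + [b] for w, b in zip(W1, b1)]
    labels, centroids = kmeans(rows, k)
    W1_folded = [c[:-1] for c in centroids]
    b1_folded = [c[-1] for c in centroids]
    # Outgoing weights of merged neurons are summed so their contributions add up.
    W2_folded = [
        [sum(row[i] for i, lab in enumerate(labels) if lab == j) for j in range(k)]
        for row in W2
    ]
    return W1_folded, b1_folded, W2_folded


def forward(W1, b1, W2, x):
    """Compute W2 · relu(W1 · x + b1)."""
    h = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    return [sum(w * hi for w, hi in zip(row, h)) for row in W2]


if __name__ == "__main__":
    # Four hidden neurons, built as two exact duplicate pairs.
    W1 = [[1.0, -2.0], [0.5, 0.3], [1.0, -2.0], [0.5, 0.3]]
    b1 = [0.1, -0.2, 0.1, -0.2]
    W2 = [[1.0, 2.0, 3.0, 4.0]]
    W1f, b1f, W2f = fold_layer(W1, b1, W2, k=2)
    x = [0.7, -0.4]
    print(forward(W1, b1, W2, x), forward(W1f, b1f, W2f, x))
```

When the merged neurons are exact duplicates, as in the demo, the folded network reproduces the original output up to floating-point rounding. When they are only approximately similar, replacing them with a centroid changes the activation statistics, which is the situation the paper's variance-preserving techniques are meant to handle.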

Problem

Research questions and friction points this paper is trying to address.

Data-free model compression technique

Merge structurally similar neurons

Preserve data statistics during compression

Innovation

Methods, ideas, or system contributions that make the work stand out.

Data-free model compression

K-means clustering integration

Large-scale model efficiency

🔎 Similar Papers

MCNC: Manifold-Constrained Reparameterization for Neural Compression