🤖 AI Summary
This paper addresses three key challenges in singular value decomposition (SVD)-based compression of large language models (LLMs): determining optimal activation truncation positions, reconstructing weights efficiently after truncation, and mitigating the information loss inherent to SVD. To this end, we propose Dobi-SVD, the first differentiable SVD compression framework tailored for LLMs. Its core innovation is a shift from conventional weight-distance optimization to an **activation-oriented compression paradigm**. We introduce a **layer-adaptive activation truncation strategy** and a **gradient-aware weight reconstruction mechanism**, enabling end-to-end differentiability and training. Crucially, our design mitigates, at the algorithmic level, the information loss caused by SVD's inherent "injection" nature. Extensive evaluation on LLaMA-2 and LLaMA-3 demonstrates that Dobi-SVD achieves a 12% reduction in perplexity (PPL) at 4-bit-equivalent precision, significantly outperforming state-of-the-art quantization and pruning baselines.
📄 Abstract
We provide a new LLM-compression solution via SVD, unlocking new possibilities for LLM compression beyond quantization and pruning. We point out that the optimal use of SVD lies in truncating activations, rather than merely using activations as an optimization distance. Building on this principle, we address three critical challenges in SVD-based LLM compression: (1) How can we determine the optimal activation truncation position for each weight matrix in LLMs? (2) How can we efficiently reconstruct the weight matrices based on truncated activations? (3) How can we address the inherent "injection" nature of SVD that results in information loss? We propose Dobi-SVD, which establishes a new, principled approach to SVD-based LLM compression.
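To make the setting concrete, the following is a minimal sketch of plain rank-k SVD truncation of a weight matrix — the baseline operation that the three questions above refine. This is an illustration of standard SVD compression, not the Dobi-SVD method itself; the function name and shapes are our own assumptions for the example.

```python
import numpy as np

def svd_truncate(W: np.ndarray, k: int):
    """Plain rank-k SVD truncation of a weight matrix W (illustrative baseline,
    not Dobi-SVD). Keeping the top-k singular triplets stores
    k * (m + n + 1) parameters instead of m * n."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :k], S[:k], Vt[:k, :]

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))  # stand-in for one LLM weight matrix
U_k, S_k, Vt_k = svd_truncate(W, k=8)
W_approx = U_k @ np.diag(S_k) @ Vt_k
# For an input activation x, (U_k @ (np.diag(S_k) @ (Vt_k @ x)))
# approximates W @ x at a fraction of the parameter cost.
```

Choosing k per matrix (question 1) and recovering accuracy after truncation (questions 2 and 3) are exactly where this naive baseline falls short, which is the gap the abstract targets.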