SSH: Sparse Spectrum Adaptation via Discrete Hartley Transformation

📅 2025-02-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the scalability and resource-efficiency limitations of existing parameter-efficient fine-tuning (PEFT) methods such as LoRA in large language models, this paper introduces the Discrete Hartley Transform (DHT) into PEFT for the first time, proposing Sparse Spectrum Adaptation via Discrete Hartley Transformation (SSH). SSH identifies and retains the most informative sparse spectral components per layer in the DHT domain and reconstructs model parameters in the spatial domain via a lightweight inverse DHT, jointly handling spectral sparsity selection and spatial reconstruction. Evaluated on language understanding, language generation, and video-text multimodal tasks, SSH significantly outperforms LoRA and other baselines: it reduces trainable parameters by over 60% and substantially lowers GPU memory consumption and computational overhead during training, while maintaining or even improving model performance.

📝 Abstract
Low-rank adaptation (LoRA) has been demonstrated to be effective in reducing the number of trainable parameters when fine-tuning a large foundation model (LLM). However, it still encounters computational and memory challenges when scaling to larger models or addressing more complex task adaptation. In this work, we introduce Sparse Spectrum Adaptation via Discrete Hartley Transformation (SSH), a novel approach that significantly reduces the number of trainable parameters while enhancing model performance. It selects the most informative spectral components across all layers, under the guidance of the initial weights after a discrete Hartley transformation (DHT). The lightweight inverse DHT then projects the spectrum back into the spatial domain for updates. Extensive experiments across both single-modality tasks, such as language understanding and generation, and multi-modality tasks, such as video-text understanding, demonstrate that SSH outperforms existing parameter-efficient fine-tuning (PEFT) methods while achieving substantial reductions in computational cost and memory requirements.
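The mechanism described in the abstract can be sketched in a few lines of numpy: take the DHT of the initial weights, pick the largest-magnitude spectral entries as the trainable set, and map a sparse spectral update back to a dense weight update with the inverse DHT (the DHT is its own inverse up to a scale factor). This is a minimal illustration, not the paper's implementation; the separable row-column 2-D transform, the magnitude-based selection rule, and the budget `k` are all assumptions made here for concreteness.

```python
import numpy as np

def dht(x, axis=-1):
    # 1-D discrete Hartley transform via the FFT: H(x)_k = Re(FFT) - Im(FFT)
    X = np.fft.fft(x, axis=axis)
    return X.real - X.imag

def dht2(W):
    # Separable row-column DHT (a common simplification of the 2-D transform)
    return dht(dht(W, axis=0), axis=1)

def idht2(H):
    # The Hartley transform is involutory: applying it twice and rescaling inverts it
    n, m = H.shape
    return dht2(H) / (n * m)

rng = np.random.default_rng(0)
W0 = rng.standard_normal((8, 8))  # stand-in for a pretrained weight matrix

# Guided by the initial weights: keep the k largest-magnitude DHT components
H0 = dht2(W0)
k = 10
idx = np.argpartition(np.abs(H0).ravel(), -k)[-k:]
mask = np.zeros(H0.size, dtype=bool)
mask[idx] = True
mask = mask.reshape(H0.shape)

# Only the selected spectral entries would be trainable; perturb them as a
# stand-in for gradient updates during fine-tuning
delta = np.zeros_like(H0)
delta[mask] = 0.1

# Lightweight inverse DHT projects the sparse spectral update back to the
# spatial domain, yielding a dense update to the weights
W = W0 + idht2(delta)

# Sanity check: the forward/inverse pair round-trips exactly
assert np.allclose(idht2(dht2(W0)), W0)
```

Only `k` spectral scalars are trainable here (versus `8 × 8 = 64` dense entries), which is the source of the parameter savings; unlike LoRA, the update is not constrained to be low-rank, only spectrally sparse.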
Problem

Research questions and friction points this paper is trying to address.

Reducing the number of trainable parameters in LLM fine-tuning
Enhancing model performance under a tight parameter budget
Lowering the computational and memory costs of existing PEFT methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Discrete Hartley Transformation
Sparse Spectrum Adaptation
Reduced trainable parameters