5% > 100%: Flatness Preference is All You Need for Multimodal Parameter-Efficient Fine-Tuning

📅 2026-06-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
The generalization mechanisms underlying multimodal parameter-efficient fine-tuning (PEFT) remain poorly understood. This work reveals a “flatness preference” that is widespread across PEFT methods: generalization is dominated by a small fraction of sharp dimensions. Building on this insight, the authors propose Flatness Preference Optimization (FlatPO), which selectively flattens these key dimensions (about 5% of them, going by the title) instead of all of them. According to the abstract, extensive experiments show that FlatPO leads various PEFT methods toward better generalization on multimodal downstream tasks.
📝 Abstract
Parameter-Efficient Fine-Tuning (PEFT) methods provide a streamlined and efficient tool for adapting large models to domain-specific multimodal downstream tasks. Although these methods have proven effective in practice, the principles behind them remain under-explored. We therefore investigate the generalization mechanisms underlying various PEFT methods and how they can be further enhanced. In this paper, we reveal a flatness preference that is widely present across PEFT methods, where a small fraction of sharp dimensions dominates generalization. This finding suggests an appealing possibility: better generalization may be attainable by attending only to this small fraction of sharp dimensions instead of all of them. Furthermore, we propose Flatness Preference Optimization (FlatPO) to flatten these key sharp dimensions, leading various PEFT methods toward better generalization. Extensive experiments demonstrate the effectiveness of our findings and the proposed method. Code is available at https://github.com/Can-Lin/FlatPO.
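To make the idea of "attending only to a small fraction of sharp dimensions" concrete, here is a minimal sketch of a sharpness-aware update whose perturbation is restricted to the top few dimensions. It is not the authors' FlatPO implementation: the abstract does not say how sharp dimensions are identified or flattened. The toy quadratic loss, the squared-gradient sharpness proxy, and all function names and hyperparameters (masked_sam_step, rho, fraction) are assumptions made for illustration only.

```python
import math

def grad(w, curv):
    # Gradient of the toy loss 0.5 * sum(c_i * w_i**2); curv holds the c_i.
    return [c * x for c, x in zip(curv, w)]

def top_fraction_mask(scores, fraction):
    # Mark the highest-scoring `fraction` of dimensions (at least one) with 1.0.
    k = max(1, int(round(fraction * len(scores))))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen = set(order[:k])
    return [1.0 if i in chosen else 0.0 for i in range(len(scores))]

def masked_sam_step(w, curv, lr=0.005, rho=0.05, fraction=0.05):
    # One sharpness-aware step that perturbs only the selected dimensions.
    g = grad(w, curv)
    # Assumed sharpness proxy: squared gradient magnitude per dimension.
    mask = top_fraction_mask([gi * gi for gi in g], fraction)
    masked_g = [gi * m for gi, m in zip(g, mask)]
    norm = math.sqrt(sum(v * v for v in masked_g)) or 1.0
    # Ascend by rho along the masked gradient direction, then take the
    # gradient at the perturbed point and apply it to the original weights.
    w_adv = [x + rho * v / norm for x, v in zip(w, masked_g)]
    g_adv = grad(w_adv, curv)
    new_w = [x - lr * gx for x, gx in zip(w, g_adv)]
    return new_w, mask

# Toy setup: 20 dimensions, one of which is much sharper than the rest.
curv = [100.0] + [1.0] * 19
w = [1.0] * 20
new_w, mask = masked_sam_step(w, curv)
print("perturbed dimensions:", [i for i, m in enumerate(mask) if m])
```

In this toy setup only the single high-curvature dimension is perturbed, so the extra sharpness-aware pressure lands on that dimension while the flat dimensions receive an ordinary gradient step.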
Problem

Research questions and friction points this paper is trying to address.

Parameter-Efficient Fine-Tuning
Multimodal
Generalization
Flatness Preference
Sharpness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Flatness Preference
Parameter-Efficient Fine-Tuning
Multimodal Learning
Generalization
FlatPO