🤖 AI Summary
Parameter-efficient fine-tuning (PEFT) methods such as LoRA suffer from limited expressivity and inter-layer redundancy due to their layer-wise independent low-rank update structures. Method: We propose a cross-layer interconnected PEFT framework featuring a collaborative architecture with locally shared A-experts and globally shared B-experts, coupled with a data-driven router that dynamically pairs A and B submodules—thereby lifting the per-layer low-rank constraint and enabling a higher effective rank in the weight update $\Delta W$. Contribution/Results: Our method significantly enhances model generalization and cross-domain adaptability. Extensive experiments across multimodal tasks, diverse architectures (e.g., ViT, LLaMA, CLIP), and model scales (from 0.5B to 7B) demonstrate consistent superiority over LoRA and other baselines—achieving higher accuracy and robustness while using equal or fewer trainable parameters.
📝 Abstract
Low-rank adaptation (LoRA) is a widely used parameter-efficient fine-tuning (PEFT) method that learns weight updates $\Delta W = AB$ for pretrained weights $W$ through low-rank adapters $A$ and $B$. While LoRA ensures hardware efficiency, its low-rank weight updates limit adaptation performance. In this paper, we propose low-rank interconnected adaptation across layers (Lily), a novel PEFT method that introduces an interconnected framework with locally shared $A$ and globally shared $B$ experts. This structure eliminates redundant per-layer $AB$ pairs, enabling higher-rank $\Delta W$ with equal or fewer parameters. To enhance expressiveness, we use data-dependent routers to determine $A$-$B$ interconnections, preventing $B$ experts from converging to the same behavior and improving representational power across domains. Experiments across modalities, architectures, and model sizes demonstrate Lily's superior performance and efficiency. GitHub: https://github.com/yibozhong/lily
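The interconnected structure described above can be sketched as follows. This is a minimal illustrative NumPy implementation, not the paper's code: it assumes a single locally shared $A$ expert for a layer group, a pool of globally shared $B$ experts, and a simple linear-plus-softmax router (all names and dimensions here are hypothetical); the data-dependent update is $\Delta W\,x = \big(\sum_j w_j(x)\,B_j\big)(A x)$.

```python
import numpy as np

rng = np.random.default_rng(0)

d, r, n_b = 8, 2, 3  # hidden dim, adapter rank, number of global B experts

# Locally shared A expert: one low-rank down-projection reused by a layer group.
A = rng.normal(scale=0.02, size=(r, d))
# Globally shared B experts: up-projections shared across all layers.
B_experts = rng.normal(scale=0.02, size=(n_b, d, r))
# Hypothetical router: linear map from the input to mixing weights over B experts.
W_router = rng.normal(scale=0.02, size=(n_b, d))

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def lily_delta(x):
    """Data-dependent low-rank update: (sum_j w_j(x) * B_j) @ (A @ x)."""
    h = A @ x                                    # down-project, shape (r,)
    w = softmax(W_router @ x)                    # router weights, shape (n_b,)
    B_mix = np.tensordot(w, B_experts, axes=1)   # input-dependent B, shape (d, r)
    return B_mix @ h                             # update applied to x, shape (d,)

x = rng.normal(size=(d,))
print(lily_delta(x).shape)  # (8,)
```

Because the mixed $B$ varies with the input and the experts are shared across layers, the updates realized over a dataset are not confined to a single fixed rank-$r$ subspace per layer, which is the intuition behind the higher effective rank claim.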