AI Summary
In federated learning (FL), jointly optimizing shared general knowledge and client-specific knowledge remains challenging. Method: This paper first uncovers the semantic specialization of the A/B matrices in LoRA and its variants (rsLoRA, VeRA): the A matrix encodes low-rank, cross-client shared general knowledge, while the B matrix captures client-specific deviations. Building on this insight, we propose FedSA-LoRA, a novel FL framework that selectively aggregates only the A matrices, enabling explicit decoupling and targeted fusion of knowledge. Contribution/Results: FedSA-LoRA establishes the first general-purpose, LoRA-family-oriented federated low-rank adaptation paradigm. Extensive experiments on NLU and NLG tasks demonstrate its superiority over baselines in accuracy, convergence speed, generalization, and communication overhead. The implementation is publicly available.
Abstract
We investigate LoRA in federated learning through the lens of the asymmetry analysis of the learned $A$ and $B$ matrices. In doing so, we uncover that $A$ matrices are responsible for learning general knowledge, while $B$ matrices focus on capturing client-specific knowledge. Based on this finding, we introduce Federated Share-A Low-Rank Adaptation (FedSA-LoRA), which employs two low-rank trainable matrices $A$ and $B$ to model the weight update, but only $A$ matrices are shared with the server for aggregation. Moreover, we delve into the relationship between the learned $A$ and $B$ matrices in other LoRA variants, such as rsLoRA and VeRA, revealing a consistent pattern. Consequently, we extend our FedSA-LoRA method to these LoRA variants, resulting in FedSA-rsLoRA and FedSA-VeRA. In this way, we establish a general paradigm for integrating LoRA with FL, offering guidance for future work on subsequent LoRA variants combined with FL. Extensive experimental results on natural language understanding and generation tasks demonstrate the effectiveness of the proposed method. Our code is available at https://github.com/Pengxin-Guo/FedSA-LoRA.
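The selective-aggregation idea can be sketched in a few lines. Below is a minimal toy simulation, assuming a single LoRA-adapted layer per client and NumPy in place of a real FL stack; all names (`local_update`, `aggregate_A`, the dimensions) are illustrative, not the authors' implementation. Each client trains both low-rank factors locally, but the server averages and redistributes only the $A$ matrices, so $B$ stays client-specific.

```python
# Toy sketch of FedSA-LoRA-style selective aggregation (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, n_clients = 8, 6, 2, 3

# Per-client LoRA factors: A initialized Gaussian, B initialized to zero,
# following the standard LoRA initialization.
clients = [
    {"A": rng.normal(size=(r, d_in)), "B": np.zeros((d_out, r))}
    for _ in range(n_clients)
]

def local_update(client):
    """Stand-in for local training: perturb both factors."""
    client["A"] += 0.01 * rng.normal(size=client["A"].shape)
    client["B"] += 0.01 * rng.normal(size=client["B"].shape)

def aggregate_A(clients):
    """Server step: average only the A matrices (shared general knowledge)."""
    A_avg = np.mean([c["A"] for c in clients], axis=0)
    for c in clients:
        c["A"] = A_avg.copy()  # broadcast shared A; B never leaves the client

for _ in range(5):  # a few federated rounds
    for c in clients:
        local_update(c)
    aggregate_A(clients)

# After aggregation, all clients share one A but keep distinct B matrices,
# and each client's weight update is the product Delta_W = B @ A.
assert all(np.allclose(clients[0]["A"], c["A"]) for c in clients)
delta_W = clients[0]["B"] @ clients[0]["A"]
print(delta_W.shape)  # (6, 8)
```

Compared with standard federated LoRA, which transmits both factors, this halves the uploaded parameters per round while letting each $B$ absorb client-specific deviations.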