🤖 AI Summary
This work addresses the challenge of efficiently adapting large language models (LLMs) across organizations while preserving data privacy and sovereignty, and avoiding the single points of failure and model inversion risks inherent in conventional federated learning. The authors propose KNEXA-FL, a novel framework that introduces a learning-based adaptive orchestration mechanism, formulating decentralized peer-to-peer collaboration as a contextual bandit problem. By leveraging the LinUCB algorithm, KNEXA-FL dynamically matches optimal collaboration pairs based on abstract agent profiles to enable secure knowledge distillation. The approach supports intelligent coordination among heterogeneous PEFT-LLM agents and demonstrates substantial improvements on code generation tasks, achieving an approximately 50% relative gain in Pass@1 over random collaboration and significantly outperforming strong centralized baselines while ensuring stable, convergent training.
📝 Abstract
Fine-tuning Large Language Models (LLMs) for specialized domains is constrained by a fundamental tension: the need for diverse, cross-organizational data conflicts with the principles of data privacy and sovereignty. While Federated Learning (FL) provides a framework for collaboration without raw data exchange, its classic centralized form introduces a single point of failure and remains vulnerable to model inversion attacks. Decentralized FL (DFL) mitigates this risk by removing the central aggregator but typically relies on inefficient, random peer-to-peer (P2P) pairings, forming a collaboration graph that is blind to agent heterogeneity and risks negative transfer. This paper introduces KNEXA-FL, a novel framework for orchestrated decentralization that resolves this trade-off. KNEXA-FL employs a non-aggregating Central Profiler/Matchmaker (CPM) that formulates P2P collaboration as a contextual bandit problem, using a LinUCB algorithm on abstract agent profiles to learn an optimal matchmaking policy. It orchestrates direct knowledge exchange between heterogeneous, PEFT-based LLM agents via secure distillation, without ever accessing the models themselves. Our comprehensive experiments on a challenging code generation task show that KNEXA-FL yields substantial gains, improving Pass@1 by approximately 50% relative to random P2P collaboration. Critically, our orchestrated approach demonstrates stable convergence, in stark contrast to a powerful centralized distillation baseline, which suffers from catastrophic performance collapse. Our work establishes adaptive, learning-based orchestration as a foundational principle for building robust and effective decentralized AI ecosystems.
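To make the contextual-bandit formulation concrete, here is a minimal sketch of disjoint LinUCB applied to peer matchmaking. This is an illustration, not the paper's implementation: the abstract does not specify the feature design, arm definition, or reward signal, so the profile vectors, candidate-pair arms, and collaboration reward below are all hypothetical assumptions.

```python
import numpy as np

class LinUCBMatchmaker:
    """Illustrative disjoint-LinUCB matchmaker (a sketch only).

    Each candidate peer pairing is treated as a bandit arm; the context is
    a hypothetical abstract agent-profile vector (e.g. domain mix or loss
    trend). The actual KNEXA-FL features and reward are not given here.
    """

    def __init__(self, dim, alpha=1.0):
        self.dim = dim
        self.alpha = alpha   # exploration strength
        self.A = {}          # per-arm d x d design matrices (ridge prior = I)
        self.b = {}          # per-arm reward-weighted context sums

    def _init_arm(self, arm):
        self.A[arm] = np.eye(self.dim)
        self.b[arm] = np.zeros(self.dim)

    def select(self, arms, contexts):
        """Pick the arm (peer pair) with the highest upper confidence bound."""
        best_arm, best_ucb = None, -np.inf
        for arm in arms:
            if arm not in self.A:
                self._init_arm(arm)
            x = contexts[arm]
            A_inv = np.linalg.inv(self.A[arm])
            theta = A_inv @ self.b[arm]               # ridge estimate
            ucb = theta @ x + self.alpha * np.sqrt(x @ A_inv @ x)
            if ucb > best_ucb:
                best_arm, best_ucb = arm, ucb
        return best_arm

    def update(self, arm, x, reward):
        """Fold the observed collaboration reward back into the arm's model."""
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x
```

The key property this sketch shows is the exploration/exploitation balance: a pairing that has never been tried keeps a wide confidence bonus, while a pairing that repeatedly yields high distillation reward accumulates evidence in `A` and `b` and is selected more often.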