🤖 AI Summary
Cross-organizational federated learning (FL) faces a fundamental trade-off between trust establishment and communication/resource overhead: existing approaches either rely on untrusted third-party aggregators or incur prohibitive costs via all-to-all direct model exchange. This paper proposes the first trust-aware, decentralized cross-domain FL framework, integrating blockchain-inspired consensus, peer-to-peer model exchange, a lightweight verification protocol, and elastic synchronization scheduling to enable secure collaboration without a third party. Departing from conventional centralized aggregation and full-mesh connectivity, the framework supports both synchronous and asynchronous coordination modes and leverages distributed storage. Experiments across multi-institutional platforms show model accuracy and convergence speed on par with ideal centralized FL, while reducing communication overhead by 37% and improving straggler tolerance by a factor of three.
📝 Abstract
Federated Learning (FL) is a decentralized machine learning (ML) paradigm in which models are trained on private data across several devices, called clients, and the resulting models, rather than the data itself, are combined at a single node called an aggregator. Many organizations employ FL to make privacy-aware, ML-driven decisions. However, organizations often operate independently rather than collaborating to enhance their FL capabilities, because they lack an effective mechanism for collaboration. The challenge lies in balancing trust and resource efficiency. One approach relies on a third-party aggregator to consolidate models from all organizations (multilevel FL), but this requires trusting an entity that may be biased or unreliable. Alternatively, organizations can bypass the third party by sharing their local models directly with one another, which requires significant computational resources for validation. Both approaches reflect a fundamental trade-off between trust and resource constraints, and neither offers an ideal solution. In this work, we develop a trust-based cross-silo FL framework called proj, which uses decentralized orchestration and distributed storage. proj gives participating organizations flexibility and offers synchronous and asynchronous modes to handle stragglers. Our evaluation on a diverse testbed shows that proj achieves performance comparable to ideal multilevel centralized FL while enabling trust among participants and efficient use of resources.
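The aggregation step and the two straggler-handling modes described above can be illustrated with a minimal sketch. The abstract does not specify the aggregation rule, so this assumes a FedAvg-style average weighted by local sample counts. The names `federated_average` and `aggregate_round` are hypothetical and are not part of the framework's interface.

```python
from typing import Dict, List, Optional, Tuple

Weights = List[float]            # flattened model parameters (assumption)
Update = Tuple[Weights, int]     # (parameters, number of local samples)


def federated_average(updates: List[Update]) -> Weights:
    """Average parameter vectors, weighting each by its local sample count."""
    if not updates:
        raise ValueError("no updates to aggregate")
    total = sum(n for _, n in updates)
    avg = [0.0] * len(updates[0][0])
    for weights, n in updates:
        for i, w in enumerate(weights):
            avg[i] += w * n / total
    return avg


def aggregate_round(
    arrived: Dict[str, Update],
    expected: List[str],
    synchronous: bool,
) -> Optional[Weights]:
    """Aggregate one round, or return None if the round cannot proceed yet.

    Synchronous mode waits until every expected organization has reported.
    Asynchronous mode aggregates whatever has arrived, so a straggler
    does not block the others.
    """
    if not arrived:
        return None
    if synchronous and any(org not in arrived for org in expected):
        return None
    return federated_average(list(arrived.values()))


if __name__ == "__main__":
    arrived = {"org_a": ([1.0, 2.0], 1), "org_b": ([3.0, 4.0], 3)}
    expected = ["org_a", "org_b", "org_c"]  # org_c is the straggler
    print(aggregate_round(arrived, expected, synchronous=True))
    print(aggregate_round(arrived, expected, synchronous=False))
```

In this sketch the synchronous call returns `None` because `org_c` has not reported, while the asynchronous call averages the two updates that did arrive. A real deployment would also need the trust and validation checks the abstract refers to, which are omitted here.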