Privacy-preserving Chunk Scheduling in a BitTorrent Implementation of Federated Learning

📅 2026-05-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Traditional federated learning relies on a central aggregator, which can become a performance bottleneck and a privacy risk. Decentralized mix-and-forward alternatives remove the server but expose peer-to-peer neighborhoods as an attack surface and can attenuate global information under heterogeneity. The paper proposes FLTorrent, a serverless federated learning framework that combines BitTorrent's efficient dissemination with within-round source unlinkability while preserving FedAvg-style aggregation semantics. A short warm-up phase applies pre-round obfuscation, randomized lags, and non-owner-first scheduling before the swarm switches to vanilla BitTorrent; a lightweight tracker handles coordination only and stays off the data path. The authors derive an upper bound on the per-transfer source-attribution posterior, and a GreedyFastestFirst scheduling heuristic attains about 92% of a bandwidth-optimal max-flow upper bound. Experiments show that warm-up stays at about 12% of each round across 100–500 peers; that under an observation-only local adversary, attribution success falls close to neighborhood-level random guessing, with unlinkability improving as the network grows; and that end-to-end overhead is about 6–10% relative to BitTorrent-only for models ranging from Gemma-7B to Llama-3.3-70B.
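To make the non-owner-first idea concrete, the following is a minimal sketch and not the paper's algorithm. It assumes each peer tracks the original owner of every chunk it holds, and that during warm-up a sender prefers to forward chunks it does not own, so that an observed transfer says little about whose update it carries. The names `pick_chunk_non_owner_first` and `randomized_lag`, the data layout, and the 2-second lag cap are all hypothetical.

```python
import random


def pick_chunk_non_owner_first(sender_id, held_chunks, needed_by_receiver, rng):
    """Pick one chunk for `sender_id` to send, preferring chunks it does not own.

    held_chunks: dict mapping chunk_id -> owner_id for chunks the sender holds.
    needed_by_receiver: set of chunk_ids the receiver still lacks.
    Returns a chunk_id, or None if the sender has nothing the receiver needs.
    """
    eligible = [c for c in held_chunks if c in needed_by_receiver]
    if not eligible:
        return None
    non_owner = [c for c in eligible if held_chunks[c] != sender_id]
    # Fall back to the sender's own chunks only when no non-owner chunk is eligible.
    pool = non_owner if non_owner else eligible
    return rng.choice(sorted(pool))


def randomized_lag(rng, max_lag_s=2.0):
    """Draw a uniform random delay (seconds) before a peer starts sending.

    The uniform distribution and the cap are assumptions for illustration.
    """
    return rng.uniform(0.0, max_lag_s)


rng = random.Random(0)
held = {"a0": "a", "b0": "b", "c0": "c"}  # peer "a" holds its own chunk and two others
choice = pick_chunk_non_owner_first("a", held, {"a0", "b0"}, rng)
print(choice)  # b0: the only eligible chunk that peer "a" does not own
```

In this sketch a peer releases its own chunk only when it has no other useful chunk to offer, which is one simple way to read "non-owner-first"; the actual scheduler is coordinated by the tracker and may weigh bandwidth as well.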
📝 Abstract
Traditional federated learning (FL) relies on a central aggregator server, which can create performance bottlenecks and privacy risks. Decentralized mix-and-forward designs remove the server, but repeated local mixing can attenuate global information under heterogeneity and exposes peer-to-peer neighborhoods as a privacy attack surface. To preserve FedAvg-style aggregation semantics (over updates reconstructable by the round deadline) while scaling dissemination, we present FLTorrent, a BitTorrent-based dissemination layer for serverless FL with a short warm-up. Warm-up hardens within-round source unlinkability, a dissemination-layer goal orthogonal to content protections (e.g., DP or secure aggregation), via (i) pre-round obfuscation, (ii) randomized lags, and (iii) coordination-only non-owner-first scheduling (tracker off the data path), before switching to vanilla BitTorrent swarming. We upper-bound the per-transfer attribution posterior by the fraction of owner chunks in a sender's eligible cover set, and derive a tighter high-probability bound that improves with early non-owner mass. A simple heuristic, GreedyFastestFirst, attains approximately 92% of a bandwidth-optimal max-flow upper bound, while warm-up remains a stable approximately 12% share of a round across 100–500 peers. Under an observation-only local adversary, FLTorrent drives attribution success close to neighborhood-level random guessing for typical nodes, improves with network size, and remains robust under collusion. In LLM-scale stress tests (Gemma-7B, DeepSeek-R1-14B, Qwen2.5-32B, and Llama-3.3-70B) over 7–10 Gbps access links, FLTorrent adds only approximately 6–10% end-to-end overhead relative to BitTorrent-only. Overall, FLTorrent shows that within-round unlinkability and BitTorrent-level efficiency can co-exist with predictable, low overheads at scale.
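The attribution bound stated in the abstract can be illustrated with a small calculation. This is a sketch under assumptions: it treats the "eligible cover set" as the set of chunks a sender could plausibly transmit at that moment and counts chunks with equal weight, whereas the paper's exact definition may differ. The function name `attribution_posterior_bound` and the data layout are hypothetical.

```python
from fractions import Fraction


def attribution_posterior_bound(sender_id, cover_set):
    """Fraction of chunks in the sender's eligible cover set that the sender owns.

    cover_set: dict mapping chunk_id -> owner_id for chunks the sender could send.
    Per the abstract, this fraction upper-bounds the posterior that an observed
    transfer carries the sender's own update.
    """
    if not cover_set:
        raise ValueError("cover set must contain at least one chunk")
    own = sum(1 for owner in cover_set.values() if owner == sender_id)
    return Fraction(own, len(cover_set))


# Peer "a" holds one of its own chunks and three chunks from other peers.
early = {"a0": "a", "b0": "b", "c0": "c", "d0": "d"}
print(attribution_posterior_bound("a", early))  # 1/4

# After receiving four more non-owner chunks, the bound tightens.
later = dict(early, e0="e", f0="f", g0="g", h0="h")
print(attribution_posterior_bound("a", later))  # 1/8
```

The two calls show why accumulating non-owner chunks early helps: the more foreign chunks a sender can choose from, the smaller the share of its own chunks in the cover set, and the lower the bound.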
Problem

Research questions and friction points this paper is trying to address.

privacy-preserving
federated learning
decentralized
unlinkability
peer-to-peer
Innovation

Methods, ideas, or system contributions that make the work stand out.

privacy-preserving dissemination
within-round unlinkability
serverless federated learning
BitTorrent-based swarming
non-owner-first scheduling
🔎 Similar Papers
Naicheng Li
IMDEA Networks Institute, Madrid, Spain
Javad Dogani
IMDEA Networks Institute, Madrid, Spain
Rui Wang
Department of Software Technology, Delft University of Technology, Delft, The Netherlands
Kaitai Liang
Department of Computing & Department of Intelligent Systems, University of Turku & Delft University of Technology, Turku, Finland & Delft, The Netherlands
Nikolaos Laoutaris
IMDEA Networks Institute
Computer networks, Internet, Privacy, Economics of networks, Economics of data