The Capacity of Information-Theoretic Secure Aggregation in Federated Learning

📅 2026-06-05

🤖 AI Summary

This work addresses the challenge of achieving information-theoretic security in federated learning without a trusted third party or prescribed key structures. It proposes a general two-phase framework, consisting of key distribution followed by update aggregation, in which users establish keys through user-to-user communication under arbitrary user-generated key-distribution mechanisms. The framework jointly characterizes three resources: randomness for security, key-distribution communication, and aggregation communication. The paper completely characterizes the capacity region among these resources through a new scheme and a matching information-theoretic converse, and it presents an explicit deterministic capacity-achieving construction over any finite field of size at least the number of users $N$. It further shows that pairwise shared keys alone suffice for optimal performance, enabling implementation via Diffie–Hellman key exchange. Compared with Google's seminal secure aggregation scheme, the proposed scheme requires fewer random masking keys while keeping the same aggregation communication overhead.

📝 Abstract

Secure aggregation allows a server to aggregate users' local updates while preserving update privacy. Existing information-theoretic formulations typically assume that correlated random keys are provided by a trusted third party (TTP) or generated via prescribed groupwise structures, while the communication cost for establishing such correlated keys is often ignored. Consequently, the fundamental limits under general key-distribution mechanisms remain unknown. In this paper, we study the $T$-colluding information-theoretic secure aggregation problem with $N$ users under a general two-phase framework consisting of a key distribution phase and an update aggregation phase. Unlike prior work, we model key distribution through user-to-user communication and allow arbitrary user-generated key-distribution mechanisms, eliminating the need for a TTP or prescribed structures. This enables a joint characterization of three resources: randomness for security, key-distribution communication, and aggregation communication. We completely characterize the capacity region among these three resources by constructing a novel secure aggregation scheme together with a matching information-theoretic converse. In particular, we develop an explicit deterministic capacity-achieving construction over any finite field of size at least $N$, whereas most existing schemes either rely on a TTP or employ randomized or existential constructions over sufficiently large finite fields. We further show that the optimal performance can be achieved using only pairwise shared keys, enabling implementation via Diffie--Hellman key exchange. Compared with Google's seminal secure aggregation scheme, the proposed scheme requires fewer random masking keys while preserving the same aggregation communication overhead.
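
To make the role of pairwise shared keys concrete, here is a minimal Python sketch of the classic pairwise-mask cancellation idea that secure aggregation builds on. It is not the paper's capacity-achieving construction: it does not handle $T$-collusion or user dropouts, and it does not reproduce the paper's key-size savings. The prime modulus `P`, the helper names, and the sample updates are all hypothetical choices for illustration. The keys are sampled in one place here only for brevity; in the setting described above, each pair of users would establish its key through user-to-user communication.

```python
import secrets

# Hypothetical prime modulus (2**31 - 1) standing in for the finite field.
P = 2_147_483_647


def pairwise_keys(n_users, length):
    """Return keys[i][j] == keys[j][i]: one shared random vector per user pair."""
    keys = [[None] * n_users for _ in range(n_users)]
    for i in range(n_users):
        for j in range(i + 1, n_users):
            shared = [secrets.randbelow(P) for _ in range(length)]
            keys[i][j] = keys[j][i] = shared
    return keys


def mask_update(i, update, keys):
    """User i adds keys shared with higher-indexed users, subtracts the others."""
    masked = [x % P for x in update]
    for j, key in enumerate(keys[i]):
        if j == i:
            continue  # no key with oneself
        sign = 1 if i < j else -1
        masked = [(m + sign * k) % P for m, k in zip(masked, key)]
    return masked


def aggregate(masked_updates):
    """Server sums masked vectors coordinate-wise; each pairwise key cancels."""
    return [sum(column) % P for column in zip(*masked_updates)]


# Illustrative local updates for three users (made-up values).
updates = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
keys = pairwise_keys(len(updates), length=3)
masked = [mask_update(i, u, keys) for i, u in enumerate(updates)]
total = aggregate(masked)
print(total)  # [111, 222, 333]
```

Each pair's key is added by one user and subtracted by the other, so the server recovers only the sum while every individual message it receives is masked by uniformly random field elements.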

Problem

Research questions and friction points this paper is trying to address.

secure aggregation

federated learning

information-theoretic security

key distribution

communication cost

Innovation

Methods, ideas, or system contributions that make the work stand out.

secure aggregation

information-theoretic security

key distribution