🤖 AI Summary
This work addresses the high error of differentially private cardinality estimation under continual observation for fully dynamic streams, where a single change in the input stream can cause many changes in the difference stream of the function being tracked. The paper characterizes the structure of the ℓₚ-sensitivity vectors of these difference streams and combines this with a tight analysis of known factorization mechanisms for the counting matrix, yielding more accurate continual estimation in streams that support both insertions and deletions. The resulting framework gives improved error bounds for counting distinct elements, estimating degree histograms, and estimating triangle counts (the last under a slightly relaxed privacy model), and the improvement is shown both analytically and empirically over a large range of parameters.
📝 Abstract
We study differentially private statistics in the fully dynamic continual observation model, where many updates can arrive at each time step and updates to a stream can involve both insertions and deletions of an item. Earlier work (e.g., Jain et al., NeurIPS 2023 for counting distinct elements; Raskhodnikova & Steiner, PODS 2025 for triangle counting with edge updates) reduced the respective cardinality estimation problem to continual counting on the difference stream associated with the true function values on the input stream. In such reductions, a single change in the original stream can cause many changes in the difference stream, which poses a challenge for applying private continual counting algorithms to obtain optimal error bounds. We improve the accuracy of several such reductions by studying the associated $\ell_p$-sensitivity vectors of the resulting difference streams and isolating their properties. We demonstrate that our framework gives improved bounds for counting distinct elements, estimating degree histograms, and estimating triangle counts (under a slightly relaxed privacy model), thus offering a general approach to private continual cardinality estimation in streaming settings. Our improved accuracy stems from a tight analysis of known factorization mechanisms for the counting matrix in this setting; the key technical challenge is arguing that one can use state-of-the-art factorizations for sensitivity vector sets with the properties we isolate. Empirically and analytically, we demonstrate that our improved error bounds offer a substantial improvement in accuracy for cardinality estimation problems over a large range of parameters.
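To make the difference-stream issue concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a toy multiset model in which an item counts as present while its net count is positive, and the helper names (`distinct_counts`, `difference_stream`, `lp_norm`) are hypothetical. It shows two input streams that differ in a single update whose difference streams nevertheless differ at every time step, so the resulting sensitivity vector has a large $\ell_1$ norm but a smaller $\ell_2$ norm.

```python
def distinct_counts(stream):
    """Return the number of items with positive net count after each time step.

    `stream` is a list of time steps; each step is a list of (item, delta)
    updates with delta = +1 (insertion) or -1 (deletion). This presence rule
    is an assumption made for illustration only.
    """
    counts = {}
    out = []
    for updates in stream:
        for item, delta in updates:
            counts[item] = counts.get(item, 0) + delta
        out.append(sum(1 for c in counts.values() if c > 0))
    return out


def difference_stream(values):
    """Return f(t) - f(t-1) for each time step, with f(-1) taken to be 0."""
    diffs = []
    prev = 0
    for v in values:
        diffs.append(v - prev)
        prev = v
    return diffs


def lp_norm(vec, p):
    """Return the l_p norm of a vector for p >= 1."""
    return sum(abs(x) ** p for x in vec) ** (1.0 / p)


# Stream A: item "a" is repeatedly inserted and deleted.
stream_a = [[], [("a", 1)], [("a", -1)], [("a", 1)], [("a", -1)]]
# Stream B: identical, except for one extra insertion of "a" at time 0,
# so "a" never disappears.
stream_b = [[("a", 1)]] + stream_a[1:]

diff_a = difference_stream(distinct_counts(stream_a))
diff_b = difference_stream(distinct_counts(stream_b))

# Sensitivity vector: entrywise gap between the two difference streams.
sensitivity = [x - y for x, y in zip(diff_b, diff_a)]
print(sensitivity)              # [1, -1, 1, -1, 1]
print(lp_norm(sensitivity, 1))  # l_1 norm grows linearly with the flips
print(lp_norm(sensitivity, 2))  # l_2 norm grows only as the square root
```

In this toy instance the prefix sums of the sensitivity vector stay in {0, 1} even though the vector itself is nonzero everywhere; this is the kind of structure that a sensitivity-aware analysis can exploit, whereas a bound based only on the number of changed entries cannot.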