Differentially Private Space-Efficient Algorithms for Counting Distinct Elements in the Turnstile Model

📅 2025-05-29
🤖 AI Summary
This work studies space-efficient distinct counting in the turnstile streaming model under differential privacy. Addressing the limitation that prior approaches require linear space, the paper proposes the first sublinear-space differentially private algorithm: for any dynamic stream of length $T$ over a universe of size $|U|$, it uses only $\tilde{O}(T^{1/3})$ space and achieves $\tilde{O}(T^{1/3})$ additive error, approaching the known $\Omega(T^{1/4})$ additive error lower bound. When the number of occurrences of each element is bounded by $W$, both space and additive error improve to $\tilde{O}(\sqrt{W})$, matching the error of prior linear-space work. The method integrates hash-based sampling, random projection, and adaptive noise injection, ensuring $(\varepsilon,\delta)$-differential privacy while enabling real-time, lightweight distinct count estimation. This is a significant improvement in space over existing linear-space solutions.
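To give a feel for the scale of these bounds, the following is a rough back-of-the-envelope sketch in Python. It evaluates only the leading-order terms $T$, $T^{1/3}$, and the $\Omega(T^{1/4})$ lower bound quoted in the abstract; constants and the polylogarithmic factors hidden by the $\tilde{O}$ notation are dropped, so the numbers indicate orders of magnitude, not actual memory usage. The function name `bound_scales` and the dictionary keys are illustrative and do not come from the paper.

```python
def bound_scales(T: int) -> dict:
    """Leading-order terms only; constants and polylog factors are ignored."""
    return {
        "linear_space": float(T),              # prior work: space linear in T
        "space_and_error_upper": T ** (1 / 3), # this paper: ~T^(1/3) space and error
        "error_lower": T ** (1 / 4),           # known lower bound on additive error
    }


if __name__ == "__main__":
    for T in (10**6, 10**9, 10**12):
        s = bound_scales(T)
        print(
            f"T={T:.0e}: linear={s['linear_space']:.0f}, "
            f"T^(1/3)={s['space_and_error_upper']:.1f}, "
            f"T^(1/4)={s['error_lower']:.1f}"
        )
```

For a stream of length $T = 10^9$, for example, the leading terms are roughly 1,000 for $T^{1/3}$ and roughly 178 for $T^{1/4}$, compared with $10^9$ for a linear-space approach.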

📝 Abstract
The turnstile continual release model of differential privacy captures scenarios where a privacy-preserving real-time analysis is sought for a dataset evolving through additions and deletions. In typical applications of real-time data analysis, both the length of the stream $T$ and the size of the universe $|U|$ from which data come can be extremely large. This motivates the study of private algorithms in the turnstile setting using space sublinear in both $T$ and $|U|$. In this paper, we give the first sublinear space differentially private algorithms for the fundamental problem of counting distinct elements in the turnstile streaming model. Our algorithm achieves, on arbitrary streams, $\tilde{O}_{\eta}(T^{1/3})$ space and additive error, and a $(1+\eta)$-relative approximation for all $\eta \in (0,1)$. Our result significantly improves upon the space requirements of the state-of-the-art algorithms for this problem, which are linear, while approaching the known $\Omega(T^{1/4})$ additive error lower bound for arbitrary streams. Moreover, when a bound $W$ on the number of times an item appears in the stream is known, our algorithm provides $\tilde{O}_{\eta}(\sqrt{W})$ additive error, using $\tilde{O}_{\eta}(\sqrt{W})$ space. This additive error asymptotically matches that of prior work, which instead required linear space. Our results address an open question posed by [Jain, Kalemaj, Raskhodnikova, Sivakumar, Smith, NeurIPS 2023] about designing low-memory mechanisms for this problem. We complement these results with a space lower bound for this problem, which shows that any algorithm that uses similar techniques must use space $\tilde{\Omega}(T^{1/3})$ on arbitrary streams.
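To make the problem setting concrete, here is a minimal Python sketch of the quantity being estimated. It is a non-private, exact, linear-space baseline, not the paper's algorithm: it only illustrates what "counting distinct elements under continual release in a turnstile stream" means and why a naive solution needs memory proportional to the number of live items. The names `exact_distinct_counts` and `max_occurrences`, and the encoding of updates as `(item, +1/-1)` pairs, are assumptions made for illustration. The sketch also assumes that an item counts as present while its net frequency is strictly positive.

```python
from collections import Counter, defaultdict
from typing import Iterable, Iterator, List, Tuple

# Hypothetical encoding: (item, +1) is an addition, (item, -1) a deletion.
Update = Tuple[str, int]


def exact_distinct_counts(stream: Iterable[Update]) -> Iterator[int]:
    """Yield the exact distinct count after every update (no privacy, no noise).

    Memory grows with the number of items that have nonzero net frequency,
    which is the linear-space cost that sublinear algorithms aim to avoid.
    """
    freq = defaultdict(int)  # item -> net frequency
    distinct = 0             # number of items with strictly positive frequency
    for item, delta in stream:
        if delta not in (1, -1):
            raise ValueError("each update must be +1 or -1")
        before = freq[item]
        after = before + delta
        freq[item] = after
        if before <= 0 < after:
            distinct += 1    # item just became present
        elif after <= 0 < before:
            distinct -= 1    # item just became absent
        if after == 0:
            del freq[item]   # drop entries that carry no information
        yield distinct


def max_occurrences(stream: Iterable[Update]) -> int:
    """Largest number of updates touching any single item (the role of W)."""
    counts = Counter(item for item, _ in stream)
    return max(counts.values(), default=0)


if __name__ == "__main__":
    stream: List[Update] = [
        ("a", 1), ("b", 1), ("a", 1), ("a", -1), ("b", -1), ("a", -1),
    ]
    print(list(exact_distinct_counts(stream)))
    print(max_occurrences(stream))
```

A private mechanism in this model would release a noisy version of each yielded value. The abstract's bounds describe how small the additive error of those releases can be made while storing far less than the full frequency table kept here.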
Problem

Research questions and friction points this paper is trying to address.

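Counting distinct elements under differential privacy using space sublinear in both $T$ and $|U|$
Supporting continual release in the turnstile model, where the stream contains both additions and deletions
Keeping additive error close to the known $\Omega(T^{1/4})$ lower bound despite the reduced space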
Innovation

Methods, ideas, or system contributions that make the work stand out.

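First sublinear-space differentially private algorithm for counting distinct elements in the turnstile model
$\tilde{O}_{\eta}(T^{1/3})$ space and additive error on arbitrary streams, with a $(1+\eta)$-relative approximation
$\tilde{O}_{\eta}(\sqrt{W})$ space and additive error when each item appears at most $W$ times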