AI Summary
This work studies space-efficient distinct counting in the turnstile streaming model under differential privacy. Addressing the limitation that prior approaches require linear space, we propose the first sublinear-space differentially private algorithm: for any dynamic stream of length $T$ over a universe of size $|U|$, it uses only $\tilde{O}(T^{1/3})$ space and achieves $\tilde{O}(T^{1/3})$ additive error, approaching the known $\Omega(T^{1/4})$ lower bound. When each element appears at most $W$ times in the stream, both the space and the additive error improve to $\tilde{O}(\sqrt{W})$, matching the error of prior linear-space algorithms. Our method integrates hash-based sampling, random projection, and adaptive noise injection, ensuring $(\varepsilon,\delta)$-differential privacy while enabling real-time, lightweight distinct count estimation. This represents a significant advance over existing linear-space solutions.
Abstract
The turnstile continual release model of differential privacy captures scenarios where privacy-preserving real-time analysis is sought for a dataset that evolves through additions and deletions. In typical applications of real-time data analysis, both the length of the stream $T$ and the size of the universe $|U|$ from which the data come can be extremely large. This motivates the study of private algorithms in the turnstile setting that use space sublinear in both $T$ and $|U|$. In this paper, we give the first sublinear-space differentially private algorithms for the fundamental problem of counting distinct elements in the turnstile streaming model. Our algorithm achieves, on arbitrary streams, $\tilde{O}_{\eta}(T^{1/3})$ space and additive error, and a $(1+\eta)$-relative approximation for all $\eta \in (0,1)$. Our result significantly improves upon the space requirements of state-of-the-art algorithms for this problem, which are linear, while its additive error approaches the known $\Omega(T^{1/4})$ lower bound for arbitrary streams. Moreover, when a bound $W$ on the number of times an item appears in the stream is known, our algorithm provides $\tilde{O}_{\eta}(\sqrt{W})$ additive error using $\tilde{O}_{\eta}(\sqrt{W})$ space. This additive error asymptotically matches that of prior work, which instead required linear space. Our results address an open question posed by [Jain, Kalemaj, Raskhodnikova, Sivakumar, Smith, NeurIPS 2023] about designing low-memory mechanisms for this problem. We complement these results with a space lower bound for this problem, which shows that any algorithm that uses similar techniques must use space $\tilde{\Omega}(T^{1/3})$ on arbitrary streams.
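To make the problem statement concrete, here is a minimal sketch of the exact, non-private baseline that stores one counter per live item and therefore uses linear space. It is not the algorithm from the paper and adds no privacy noise. The function name `exact_distinct_counts`, the encoding of each update as an `(item, delta)` pair with `delta` equal to `+1` (insertion) or `-1` (deletion), and the convention that an item counts as present when its net frequency is positive are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Hashable, Iterable, Iterator, Tuple


def exact_distinct_counts(
    stream: Iterable[Tuple[Hashable, int]]
) -> Iterator[int]:
    """Yield the exact number of distinct items after every update.

    Illustrative baseline only: memory grows with the number of live
    items, which is the linear-space cost the abstract aims to avoid.
    """
    freq = defaultdict(int)  # net frequency of each item seen so far
    distinct = 0             # number of items with positive net frequency

    for item, delta in stream:
        if delta not in (1, -1):
            raise ValueError("each update must be +1 or -1")

        before = freq[item]
        after = before + delta

        # The distinct count changes only when an item crosses zero.
        if before <= 0 < after:
            distinct += 1
        elif after <= 0 < before:
            distinct -= 1

        if after == 0:
            del freq[item]   # drop items that are no longer present
        else:
            freq[item] = after

        yield distinct


if __name__ == "__main__":
    updates = [("a", 1), ("b", 1), ("a", 1), ("a", -1), ("a", -1), ("b", -1)]
    print(list(exact_distinct_counts(updates)))  # [1, 2, 2, 2, 1, 0]
```

In this toy stream $T = 6$ and item `a` appears in four updates, so any valid bound $W$ on per-item occurrences must be at least 4. A continual-release mechanism would output a private estimate of each yielded value instead of the exact count.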