🤖 AI Summary
Problem: Existing differentially private heavy hitter detection methods for data streams under continual observation are inefficient: they inject noise over the entire sketch at each step and incur a per-update cost of Ω(|U|), which makes them infeasible for large domains.
Method: This paper proposes a lazy-update differentially private sketching framework. Its core innovations include (i) dynamically updating only local sketch regions, and (ii) introducing a rotation-based local perturbation mechanism, achieving ε-differential privacy while reducing per-step detection complexity to O(d log w).
Contribution/Results: Theoretical analysis establishes tight privacy–utility trade-off bounds, proving optimality. Experiments demonstrate up to 250× throughput improvement over state-of-the-art methods. Our framework is the first to enable high-throughput, low-latency real-time private frequency estimation and heavy hitter detection, significantly advancing the practical deployment of differential privacy in streaming settings.
📝 Abstract
Differentially private frequency estimation and heavy hitter detection are core problems in the private analysis of data streams. Two models are typically considered: the one-pass model, which outputs results only at the end of the stream, and the continual observation model, which requires releasing private summaries at every time step. While the one-pass model allows more efficient solutions, continual observation better reflects scenarios where timely and ongoing insights are critical.
In the one-pass setting, sketches have proven to be an effective tool for differentially private frequency analysis, as they can be privatized by a single injection of calibrated noise. In contrast, existing methods in the continual observation model add fresh noise to the entire sketch at every step, incurring high computational costs. This challenge is particularly acute for heavy hitter detection, where current approaches often require querying every item in the universe at each step, resulting in untenable per-update costs for large domains.
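As a rough illustration of the one-pass approach described above, the following sketch builds a Count-Min table and privatizes it with a single round of Laplace noise at the end of the stream. It is a minimal sketch under stated assumptions, not the construction from any particular paper: the class name `PrivateCountMin`, the salted use of Python's built-in `hash`, and the choice of neighboring streams that differ in one element (so that one element changes `d` cells by 1, giving L1 sensitivity `d`) are all assumptions made for this example.

```python
import numpy as np


class PrivateCountMin:
    """Toy Count-Min sketch with one-shot Laplace noise (illustrative only)."""

    def __init__(self, d, w, seed=0):
        self.d, self.w = d, w
        self.rng = np.random.default_rng(seed)
        # One salt per row; salted built-in hash stands in for a hash family
        # (an assumption for this example, not a vetted construction).
        self.salts = [int(s) for s in self.rng.integers(1, 2**31 - 1, size=d)]
        self.table = np.zeros((d, w))

    def _cols(self, item):
        # Column index of `item` in each of the d rows.
        return [hash((s, item)) % self.w for s in self.salts]

    def update(self, item):
        # A stream element increments exactly one cell per row.
        for r, c in enumerate(self._cols(item)):
            self.table[r, c] += 1

    def release(self, epsilon):
        # Single noise injection at the end of the stream. Assumes L1
        # sensitivity d (one element touches d cells), hence scale d / epsilon.
        noise = self.rng.laplace(scale=self.d / epsilon, size=(self.d, self.w))
        return self.table + noise

    def estimate(self, table, item):
        # Classic Count-Min estimate: minimum over the item's cells.
        return min(table[r, c] for r, c in enumerate(self._cols(item)))
```

The noise is drawn once for all `d * w` cells, which is affordable when it happens only at the end of the stream. Repeating that full-table step at every time step is the cost the continual observation setting runs into.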
To overcome these limitations, we introduce a new differentially private sketching technique based on lazy updates, which perturbs and updates only a small, rotating part of the output sketch at each time step. This significantly reduces computational overhead while maintaining strong privacy and utility guarantees. Compared to prior art, our method improves the update time for frequency estimation by a factor of $O(w)$ for sketches of dimension $d \times w$; for heavy hitter detection, it reduces per-update complexity from $\Omega(|U|)$ to $O(d \log w)$, where $U$ is the input domain. Experiments show an increase in throughput by a factor of $250$, making differential privacy more practical for real-time, continual-observation applications.
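To make the cost argument concrete, the toy code below refreshes only one column of the published sketch per time step, chosen round-robin, so each step touches `d` cells instead of all `d * w`. This is a hedged illustration of the general lazy-update idea, not the paper's algorithm: the class `LazyRotatingSketch`, the round-robin column schedule, and the `noise_scale` parameter are assumptions for this example, and `noise_scale` is a placeholder that is not calibrated to give any privacy guarantee under continual observation.

```python
import numpy as np


class LazyRotatingSketch:
    """Toy lazy-update sketch: refresh one published column per step."""

    def __init__(self, d, w, noise_scale, seed=0):
        self.d, self.w = d, w
        # Placeholder scale only; NOT a calibrated privacy parameter.
        self.noise_scale = noise_scale
        self.rng = np.random.default_rng(seed)
        self.salts = [int(s) for s in self.rng.integers(1, 2**31 - 1, size=d)]
        self.exact = np.zeros((d, w))      # internal counts, never released
        self.published = np.zeros((d, w))  # noisy output sketch
        self.t = 0

    def step(self, item):
        # Exact update: one cell per row, O(d) work.
        for r, s in enumerate(self.salts):
            self.exact[r, hash((s, item)) % self.w] += 1
        # Lazy refresh: re-noise only column (t mod w), again O(d) work,
        # instead of re-noising all d * w cells.
        j = self.t % self.w
        noise = self.rng.laplace(scale=self.noise_scale, size=self.d)
        self.published[:, j] = self.exact[:, j] + noise
        self.t += 1
        return j
```

Under this schedule every column of the published sketch is refreshed once every `w` steps, which is where a saving of a factor of `w` in per-step work comes from in this toy version. How the noise must be calibrated for such a scheme to be private is exactly what the paper addresses and is not reproduced here.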