Differentially Private Hierarchical Heavy Hitters

📅 2026-06-11
🤖 AI Summary
This work addresses the challenge of efficiently identifying hierarchical heavy hitters (HHHs) under rigorous differential privacy guarantees. The authors propose tailored privacy-preserving algorithms for both streaming and non-streaming settings that control frequency estimation error while satisfying differential privacy. Their key contributions are threefold. First, they give the first differentially private mechanism for HHH release. Second, in the non-streaming setting, they establish that the relative error of residual counts is independent of both the hierarchy depth and the number of heavy hitters. Third, in the streaming setting, they show that the absolute error of frequency estimates is independent of the available memory. Experimental results demonstrate that the proposed methods significantly outperform existing approaches in balancing privacy preservation and accuracy.
📝 Abstract
The task of finding _Hierarchical_ Heavy Hitters (HHH) was introduced by Cormode et al. [VLDB 2003] as a generalisation of the heavy hitter problem. While finding HHH in data streams has been studied extensively, the question of releasing HHH when the underlying data is private remains unexplored. In this paper, we study differentially private HHH release in both the streaming and non-streaming settings. In the non-streaming setting, we show the surprising result that the relative error in estimating the residual count for any prefix is independent of the height of the hierarchy and the number of heavy hitters in the stream. Meanwhile, in the streaming setting, although the exact version of HHH has low global sensitivity (as counting queries are 1-sensitive), the approximation functions due to streaming have high global sensitivity, linear in the available space. Despite this obstacle, we show that the absolute error for estimating frequencies in the streaming setting is independent of the available space.
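To make the remark that counting queries are 1-sensitive concrete, the following is a minimal sketch of a naive non-streaming baseline, not the mechanism from the paper. It assumes items are fixed-length strings over a small known alphabet, treats every prefix as a node of the hierarchy, and adds Laplace noise to each prefix count. The function names, the string-prefix hierarchy, and the add/remove-one-item neighbouring relation are all assumptions made for illustration. Because one item contributes to one prefix at each level, the vector of all prefix counts has L1 sensitivity equal to the height, so the noise scale in this baseline grows with the height of the hierarchy.

```python
import itertools
import random
from collections import Counter


def all_prefixes(alphabet, height):
    """Enumerate every prefix of length 1..height over the alphabet."""
    for level in range(1, height + 1):
        for symbols in itertools.product(alphabet, repeat=level):
            yield "".join(symbols)


def prefix_counts(items, alphabet, height):
    """Exact count for every prefix in the domain, including zero counts."""
    observed = Counter()
    for item in items:
        for level in range(1, height + 1):
            observed[item[:level]] += 1
    return {p: observed.get(p, 0) for p in all_prefixes(alphabet, height)}


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_prefix_counts(items, alphabet, height, epsilon, rng=None):
    """Laplace mechanism over all prefix counts (illustrative baseline).

    Assumption: neighbouring datasets differ by adding or removing one
    item, which changes exactly `height` counts by 1 each, so the L1
    sensitivity is `height` and the noise scale is height / epsilon.
    """
    rng = rng or random.Random()
    scale = height / epsilon
    exact = prefix_counts(items, alphabet, height)
    return {p: c + laplace_noise(scale, rng) for p, c in exact.items()}


if __name__ == "__main__":
    data = ["aa", "ab", "ab", "ba"]
    release = noisy_prefix_counts(data, alphabet="ab", height=2, epsilon=1.0)
    for prefix, value in sorted(release.items()):
        print(prefix, round(value, 2))
```

Noise is added to every prefix in the domain, including those with a zero count, so that the set of released keys does not depend on the data. Enumerating the full domain is only feasible for small alphabets and shallow hierarchies, which is one reason this sketch should be read as a baseline for intuition rather than a practical algorithm.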
Problem

Research questions and friction points this paper is trying to address.

Differential Privacy
Hierarchical Heavy Hitters
Private Data Release
Streaming Algorithms
Frequency Estimation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Differential Privacy
Hierarchical Heavy Hitters
Streaming Algorithms
Global Sensitivity
Residual Count Estimation