Missing Mass for Differentially Private Domain Discovery

📅 2026-03-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work studies differentially private domain discovery, where each user holds a subset of items drawn from a shared but unknown domain and the goal is to output an informative subset of those items. For set union, the authors show that the Weighted Gaussian Mechanism (WGM), a simple baseline, attains a near-optimal ℓ₁ missing mass guarantee on Zipfian data and a distribution-free ℓ∞ missing mass guarantee. They then use WGM as a domain-discovery preprocessing step that lets existing known-domain algorithms for private top-k selection and k-hitting set run in the unknown-domain setting, yielding new utility guarantees for those variants. In experiments, the WGM-based methods match or outperform existing baselines on all three tasks.

📝 Abstract

We study several problems in differentially private domain discovery, where each user holds a subset of items from a shared but unknown domain, and the goal is to output an informative subset of items. For set union, we show that the simple baseline Weighted Gaussian Mechanism (WGM) has a near-optimal $\ell_1$ missing mass guarantee on Zipfian data as well as a distribution-free $\ell_\infty$ missing mass guarantee. We then apply the WGM as a domain-discovery precursor for existing known-domain algorithms for private top-$k$ and $k$-hitting set and obtain new utility guarantees for their unknown domain variants. Finally, experiments demonstrate that all of our WGM-based methods are competitive with or outperform existing baselines for all three problems.
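To make the set-union setting concrete, here is a minimal Python sketch of a weighted Gaussian thresholding scheme. The abstract does not spell out the mechanism, so the details are assumptions based on the usual formulation from the private set union literature: each user's set is capped at `max_items`, every kept item receives weight 1/√(set size) so a user's total contribution has unit ℓ₂ norm, Gaussian noise is added to each observed item's weight, and items above a threshold are released. The function names, the parameters `sigma`, `threshold`, and `max_items`, and the normalization of the two missing mass measures are all hypothetical; calibrating `sigma` and `threshold` to a target (ε, δ) is not shown.

```python
import math
import random
from collections import Counter, defaultdict


def weighted_gaussian_union(user_sets, sigma, threshold, max_items, rng=None):
    """Sketch of weighted Gaussian thresholding (assumed formulation)."""
    rng = rng or random.Random()
    weights = defaultdict(float)
    for items in user_sets:
        kept = sorted(set(items))
        if len(kept) > max_items:
            # Cap each user's contribution at max_items items.
            kept = rng.sample(kept, max_items)
        if not kept:
            continue
        # Unit l2-norm contribution: each kept item gets 1/sqrt(|kept|).
        w = 1.0 / math.sqrt(len(kept))
        for item in kept:
            weights[item] += w
    # Noise is added only to items that at least one user holds;
    # the threshold is what keeps rare items from being released.
    return {
        item
        for item, w in weights.items()
        if w + rng.gauss(0.0, sigma) > threshold
    }


def missing_mass(user_sets, released):
    """Return (l1, linf) missing mass under an assumed normalization:
    item frequency = share of all (user, item) occurrences."""
    counts = Counter(item for s in user_sets for item in set(s))
    total = sum(counts.values())
    if total == 0:
        return 0.0, 0.0
    missed = [c / total for item, c in counts.items() if item not in released]
    return sum(missed), max(missed, default=0.0)


if __name__ == "__main__":
    data = [{"a", "b"}, {"a"}, {"a", "c"}] * 50
    out = weighted_gaussian_union(
        data, sigma=2.0, threshold=10.0, max_items=2, rng=random.Random(0)
    )
    print(sorted(out), missing_mass(data, out))
```

Under these assumptions, the ℓ₁ measure sums the frequency of every item the mechanism fails to release, while the ℓ∞ measure reports only the most frequent missed item.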

Problem

Research questions and friction points this paper is trying to address.

differentially private domain discovery

missing mass

set union

private top-k

k-hitting set

Innovation

Methods, ideas, or system contributions that make the work stand out.

Weighted Gaussian Mechanism

missing mass

differentially private domain discovery