Communication Cost Reduction for Subgraph Counting under Local Differential Privacy via Hash Functions

📅 2023-12-12
🏛️ arXiv.org
📈 Citations: 5
Influential: 0
🤖 AI Summary
To address the high communication overhead and poor scalability of subgraph counting under edge local differential privacy (edge-LDP), this paper proposes a compression framework based on linear congruence hashing, a form of compression that every node can reproduce on its own. With a sampling rate of $s$, the method cuts communication costs by a factor of $s^2$, while the variance of the published graph statistic grows by a factor of $s$. Combined with edge perturbation, subgraph sampling, and an unbiased estimator, the algorithm achieves up to 1000× lower $\ell_2$-error for triangle counting than state-of-the-art methods under identical communication budgets, which makes LDP-based graph analytics more practical for large graphs.
📝 Abstract
We suggest the use of hash functions to cut down the communication costs when counting subgraphs under edge local differential privacy. While various algorithms exist for computing graph statistics, including subgraph counts, under edge local differential privacy, many suffer from high communication costs, making them less efficient for large graphs. Though data compression is a typical approach in differential privacy, its application in local differential privacy requires a form of compression that every node can reproduce. In our study, we introduce linear congruence hashing. With a sampling rate of $s$, our method can cut communication costs by a factor of $s^2$, albeit at the cost of increasing the variance of the published graph statistic by a factor of $s$. The experimental results indicate that, when matched for communication costs, our method reduces the $\ell_2$-error for triangle counts by up to 1000 times compared to leading algorithms.
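As a rough illustration of what "a form of compression that every node can reproduce" could look like, the sketch below uses a linear congruence hash of the form ((a·x + b) mod p) mod m with publicly shared parameters. The bucket-matching sampling rule, the names `HashParams`, `lc_hash`, `pair_is_sampled`, and `entries_to_report`, and the parameter values are all assumptions made for illustration; they are not taken from the paper and do not reproduce its stated $s^2$ saving.

```python
from typing import List, NamedTuple


class HashParams(NamedTuple):
    """Public parameters shared by all nodes (hypothetical values below)."""
    a: int  # multiplier, 1 <= a < p
    b: int  # offset, 0 <= b < p
    p: int  # prime larger than the number of nodes
    m: int  # number of buckets


def lc_hash(x: int, params: HashParams) -> int:
    """Linear congruence hash: ((a*x + b) mod p) mod m."""
    return ((params.a * x + params.b) % params.p) % params.m


def pair_is_sampled(i: int, j: int, params: HashParams) -> bool:
    """Assumed rule: keep the pair (i, j) only if both endpoints share a bucket.

    The rule is symmetric, so nodes i and j reach the same decision
    without exchanging any message.
    """
    return i != j and lc_hash(i, params) == lc_hash(j, params)


def entries_to_report(node: int, n: int, params: HashParams) -> List[int]:
    """Indices of the adjacency-vector entries that `node` would still send."""
    return [j for j in range(n) if pair_is_sampled(node, j, params)]


params = HashParams(a=3, b=1, p=11, m=2)
print(entries_to_report(0, 6, params))  # [2, 5]
```

Because the selection depends only on the public parameters, the server can recompute which entries each node was expected to report. In an edge-LDP protocol, the retained entries would still need to be perturbed (for example with randomized response) before being sent; that step is omitted here.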
Problem

Research questions and friction points this paper is trying to address.

Reduce communication costs in subgraph counting
Address high variance in graph statistics
Improve efficiency for large graphs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hash functions reduce communication costs
Linear congruence hashing cuts communication costs by a factor of $s^2$
Method lowers $\ell_2$-error for triangle counts by up to 1000× at matched communication cost