Seeding with Differentially Private Network Information

📅 2023-05-26

🏛️ arXiv.org

📈 Citations: 4

✨ Influential: 0

🤖 AI Summary

Problem: In public health applications such as HIV prevention, complete behavioral contact networks are often unavailable; only privacy-sensitive, sequential contact samples can be obtained. Method: This paper introduces the first differentially private seeding algorithm for influence maximization, supporting both central and local privacy models. It combines randomized data collection, influence estimation from sampled cascades, and theoretical bounds on estimation error, which yield performance guarantees under limited samples. Contribution/Results: Experiments show that under central differential privacy, performance degrades gracefully as the privacy budget decreases; under local differential privacy, a larger budget is required to maintain effectiveness, consistent with theoretical predictions. This work provides the first solution for identifying high-impact individuals in privacy-constrained public health interventions that offers both provable theoretical guarantees and empirical efficacy.

📝 Abstract

In public health interventions such as the distribution of preexposure prophylaxis (PrEP) for HIV prevention, decision makers rely on seeding algorithms to identify key individuals who can amplify the impact of their interventions. In such cases, building a complete sexual activity network is often infeasible due to privacy concerns. Instead, contact tracing can provide influence samples, that is, sequences of sexual contacts without requiring complete network information. This presents two challenges: protecting individual privacy in contact data and adapting seeding algorithms to work effectively with incomplete network information. To solve these two problems, we study privacy guarantees for influence maximization algorithms when the social network is unknown and the inputs are samples of prior influence cascades that are collected at random and need privacy protection. Building on recent results that address seeding with costly network information, our privacy-preserving algorithms introduce randomization in the collected data or the algorithm output and can bound the privacy loss of each node (or group of nodes) in deciding to include their data in the algorithm input. We provide theoretical guarantees of seeding performance with a limited sample size subject to differential privacy budgets in both central and local privacy regimes. Simulations on synthetic random graphs and empirically grounded sexual contacts of men who have sex with men reveal the diminishing value of network information with decreasing privacy budget in both regimes and graceful decrease in performance with decreasing privacy budget in the central regime. Achieving good performance with local privacy guarantees requires relatively higher privacy budgets that confirm our theoretical expectations.
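To make the local regime concrete, here is a minimal Python sketch of the "randomization in the collected data" idea, followed by greedy seeding on the noisy samples. It is an illustration under stated assumptions, not the paper's algorithm: the binary encoding (`samples[i][v] == 1` when node `v` appears in the i-th sampled cascade), the per-entry randomized response, and the function names `randomized_response` and `greedy_seeds` are all hypothetical choices made for this example. The sketch also does not correct for the bias that the noise introduces into coverage counts.

```python
import math
import random


def randomized_response(samples, epsilon, rng):
    """Flip each bit independently with probability 1 / (1 + e^epsilon).

    Assumption: privacy is accounted per entry of the sample matrix;
    the paper's exact accounting may differ.
    """
    flip_prob = 1.0 / (1.0 + math.exp(epsilon))
    return [
        [(bit ^ 1) if rng.random() < flip_prob else bit for bit in row]
        for row in samples
    ]


def greedy_seeds(samples, k):
    """Greedily pick up to k nodes that cover the most influence samples.

    A sample counts as covered once any chosen seed appears in it.
    """
    n = len(samples[0])
    uncovered = set(range(len(samples)))
    seeds = []
    for _ in range(min(k, n)):
        candidates = [v for v in range(n) if v not in seeds]
        # Marginal gain of v = number of still-uncovered samples containing v.
        best = max(candidates, key=lambda v: sum(samples[i][v] for i in uncovered))
        seeds.append(best)
        uncovered = {i for i in uncovered if samples[i][best] == 0}
    return seeds


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
    noisy = randomized_response(samples, epsilon=1.0, rng=rng)
    print(greedy_seeds(noisy, k=2))
```

With a small `epsilon` the flip probability approaches 1/2 and the noisy samples carry little signal, which is one intuitive reading of why the local regime needs a larger privacy budget.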

Problem

Research questions and friction points this paper is trying to address.

Protecting individual privacy in sexual contact network data

Adapting seeding algorithms for incomplete influence cascade data

Achieving differential privacy guarantees for HIV prevention interventions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Using differential privacy for network data protection

Seeding algorithms adapted for incomplete influence samples

Randomization techniques to bound individual privacy loss
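
As an illustration of randomizing the algorithm output rather than the data (the central regime), the following sketch uses the standard exponential mechanism to choose one seed from per-node scores. This is a generic differential privacy technique shown for intuition; whether and how the paper uses it is not stated on this page. The function name `exponential_mechanism`, the `scores` list, and the `sensitivity` value are assumptions; in particular, the caller must supply a sensitivity that matches their own definition of neighboring datasets.

```python
import math
import random


def exponential_mechanism(scores, epsilon, sensitivity, rng):
    """Sample an index with probability proportional to
    exp(epsilon * score / (2 * sensitivity)).

    Scores are shifted by their maximum before exponentiating, which
    leaves the distribution unchanged and avoids overflow.
    """
    top = max(scores)
    weights = [
        math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores
    ]
    return rng.choices(range(len(scores)), weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical scores: how many sampled cascades each node appears in.
    coverage = [5, 1, 0]
    print(exponential_mechanism(coverage, epsilon=1.0, sensitivity=1.0, rng=rng))
```

A large `epsilon` concentrates the choice on the highest-scoring node, while a small `epsilon` makes the choice close to uniform.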
