🤖 AI Summary
This study addresses the challenge of releasing connectivity statistics that involve node attributes in social and economic networks, where conventional differential privacy approaches struggle: connectedness indices have high global sensitivity, and a single node's attribute can feed into many cells at once, leading to poor composition. The authors propose a three-stage method under edge-adjacent differential privacy: first perturbing node attributes, then analytically debiasing downstream statistics, and finally adding a second layer of noise to edge existence indicators. The approach integrates attribute perturbation with edge-level privacy protection and yields consistent, asymptotically normal private estimators of connectivity indices for both discrete and continuous labels. Experiments on synthetic networks and on real-world networks with as few as 200 nodes show that the method performs well.
📝 Abstract
Researchers increasingly use data on social and economic networks to study a range of social science questions, but releasing statistics derived from networks can raise significant privacy concerns. We show how to release network connectedness indices that quantify assortative mixing across node attributes under edge-adjacent differential privacy. Standard privacy techniques perform poorly in this setting both because connectedness indices have high global sensitivity and because a single node's attribute can potentially be an input to connectedness in thousands of cells, leading to poor composition. Our method, which is straightforward to apply, first adds noise to node attributes, then analytically debiases downstream statistics, and finally applies a second layer of noise to protect the presence or absence of individual edges. We prove consistency and asymptotic normality of our estimators for both discrete and continuous labels and show our method works well in simulations and on real networks with as few as 200 nodes collected by social scientists.
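The three-stage recipe in the abstract can be illustrated on one connectedness-style statistic: the number of edges that cross between two attribute groups. The sketch below is a hypothetical toy instantiation, not the authors' implementation; the graph model, privacy parameters, and the choice of randomized response for stage one and a Laplace mechanism for stage three are all illustrative assumptions.

```python
import numpy as np

# Illustrative sketch (not the paper's code) of the three stages:
# (1) perturb node attributes, (2) analytically debias the statistic,
# (3) add a second layer of noise protecting individual edges.

rng = np.random.default_rng(0)

# Toy network: n nodes, binary attribute, Erdos-Renyi edges.
n = 2000
labels = rng.integers(0, 2, size=n)               # true node attributes
adj = np.triu(rng.random((n, n)) < 0.01, k=1)     # upper-triangular edge indicators

eps_attr, eps_edge = 1.0, 1.0                     # assumed privacy budgets per stage

# Stage 1: randomized response on node attributes.
# Each label is reported truthfully with probability q = e^eps / (1 + e^eps).
q = np.exp(eps_attr) / (1.0 + np.exp(eps_attr))
keep = rng.random(n) < q
noisy_labels = np.where(keep, labels, 1 - labels)

# Observed cross-group edge count under the noisy labels.
i, j = np.nonzero(adj)
m_edges = len(i)
obs_cross = np.sum(noisy_labels[i] != noisy_labels[j])

# Stage 2: analytic debiasing.
# P(observed cross | truly cross) = q^2 + (1-q)^2,
# P(observed cross | truly same)  = 2 q (1-q), so
# E[obs] = m_cross * (q^2 + (1-q)^2) + (M - m_cross) * 2 q (1-q),
# which solves to m_cross = (obs - 2 M q (1-q)) / (2q - 1)^2.
debiased = (obs_cross - 2 * m_edges * q * (1 - q)) / (2 * q - 1) ** 2

# Stage 3: noise protecting the presence or absence of single edges.
# Adding or removing one edge changes the cross count by at most 1,
# so Laplace(1 / eps_edge) noise suffices for this stage alone.
private_est = debiased + rng.laplace(scale=1.0 / eps_edge)

true_cross = np.sum(labels[i] != labels[j])
```

On a graph of this size the debiased private estimate typically lands within a few percent of `true_cross`, while the raw noisy count `obs_cross` is biased toward `M/2`; this is the sense in which attribute perturbation alone would perform poorly without the analytic debiasing step.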