AI Summary
This work addresses a key limitation of traditional k-anonymity methods for networks: they assume that an adversary knows the network structure exactly, which is often unrealistic. The authors propose the φ-k-anonymity model, a fuzzy variant of k-anonymity in which the parameter φ captures the uncertainty in an attacker's structural knowledge. They also introduce a Greedy anonymization algorithm that operates under an edge-modification budget. Experiments on 39 real-world networks show that at φ = 5%, uncertainty alone renders 64% of previously unique nodes anonymous on average. Under a 5% edge-modification budget and φ = 10%, the Greedy algorithm anonymizes over 99% of nodes, while structural properties and performance on network analysis tasks are well preserved, with most metrics changing by less than 5%. Modest uncertainty assumptions thus yield high anonymity without substantially sacrificing data utility.
Abstract
With the introduction of large-scale network data, including population-scale social networks, techniques for privacy-aware sharing of network data become increasingly important. While existing $k$-anonymity approaches can model different attacker scenarios, they typically assume that attacker knowledge exactly matches the published network structure. We argue that exact knowledge is often unrealistic and introduce $\phi$-$k$-anonymity, a fuzzy variant of $k$-anonymity in which parameter $\phi$ captures the level of uncertainty in attacker knowledge. Across a benchmark of $39$ real-world networks, a realistic level of uncertainty ($\phi=5\%$) renders, on average, $64\%$ of previously unique nodes anonymous. To further enhance anonymity, we apply anonymization algorithms under a $5\%$ edge modification budget. While full anonymization is often unattainable under exact $k$-anonymity, with low uncertainty ($\phi=10\%$) our newly proposed Greedy algorithm anonymizes over $99\%$ of the nodes. Uncertainty also enables effective anonymization in otherwise difficult-to-anonymize dense synthetic graphs. Additionally, data utility in terms of structural properties and performance on network analysis tasks is well preserved, with most metrics changing less than $5\%$. Overall, our findings suggest that modest uncertainty assumptions yield high levels of anonymity and utility, motivating further research on uncertainty-aware privacy guarantees for network data.
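To make the idea of fuzzy matching concrete, the following minimal Python sketch counts how many nodes an attacker with imprecise knowledge could confuse with a target. It is an illustration under stated assumptions, not the paper's definition: the attacker is assumed to know only a node's degree, and a node counts as a candidate match when its degree lies within a relative tolerance `phi` of the target's degree. The function names (`degrees`, `candidate_count`, `anonymous_fraction`) and the toy graph are hypothetical.

```python
from collections import defaultdict


def degrees(edges):
    """Return a dict mapping each node to its degree in an undirected edge list."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


def candidate_count(deg, target, phi):
    """Count nodes whose degree is within phi * deg(target) of the target's degree.

    ASSUMPTION: degree is the attacker's structural measure and the tolerance
    is relative to the target's degree; the paper may define both differently.
    """
    d = deg[target]
    return sum(1 for other in deg.values() if abs(other - d) <= phi * d)


def anonymous_fraction(edges, k, phi):
    """Fraction of nodes that have at least k candidate matches (itself included)."""
    deg = degrees(edges)
    hits = sum(1 for node in deg if candidate_count(deg, node, phi) >= k)
    return hits / len(deg)


# Toy graph (hypothetical): hub "a" with 4 leaves and hub "b" with 5 leaves.
edges = [("a", i) for i in range(1, 5)] + [("b", i) for i in range(5, 10)]

exact = anonymous_fraction(edges, k=2, phi=0.0)    # both hubs are unique
fuzzy = anonymous_fraction(edges, k=2, phi=0.25)   # hubs become indistinguishable
print(exact, fuzzy)
```

With exact knowledge (`phi=0.0`) the two hubs have distinct degrees and are unique, so only the 9 leaves out of 11 nodes are 2-anonymous. With `phi=0.25` the hubs' degrees (4 and 5) fall within each other's tolerance, so every node is 2-anonymous. The tolerance here is much larger than the 5% and 10% levels in the abstract only because the toy degrees are small.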