The anonymization problem in social networks

📅 2024-09-24

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 0


🤖 AI Summary

This paper addresses the *k*-anonymization problem on social network graphs: altering a given graph, primarily by deleting edges, to maximize the number of *k*-anonymous nodes, i.e., nodes that have at least *k*−1 equivalent peers under a chosen measure of structural node equivalence. It formulates three variants of this optimization problem: full, partial, and budgeted anonymization. The authors propose four new heuristic algorithms, implemented in a reusable computational framework, and compare them against an edge-sampling baseline from previous work. Experiments on graph models and 23 real-world networks show that edge deletion is the most effective alteration operation, that the choice among four common anonymity measures strongly affects both the initial anonymity and the difficulty of the problem, and that the heuristic that preferentially deletes edges with a larger effect on structurally unique nodes consistently outperforms heuristics based solely on network structure. The best-performing algorithm retains, on average, 14× more edges than the baseline under full anonymization and yields 4.8× more anonymous nodes in the budgeted variant.


📝 Abstract

In this paper we introduce a general version of the anonymization problem in social networks, in which the goal is to maximize the number of anonymous nodes by altering a given graph. We define three variants of this optimization problem: full, partial, and budgeted anonymization. In each, the objective is to maximize the number of k-anonymous nodes, i.e., nodes for which there are at least k-1 equivalent nodes, according to a particular anonymity measure of structural node equivalence. We propose four new heuristic algorithms for solving the anonymization problem, which we implement in a reusable computational framework. As a baseline, we use an edge sampling method introduced in previous work. Experiments on both graph models and 23 real-world network datasets yield three empirical findings. First, we demonstrate that edge deletion is the most effective graph alteration operation. Second, we compare four commonly used anonymity measures from the literature and highlight how the choice of anonymity measure strongly affects both the initial anonymity and the difficulty of solving the anonymization problem. Third, we find that the proposed algorithm that preferentially deletes edges with a larger effect on nodes at a structurally unique position consistently outperforms heuristics based solely on network structure. Our best-performing algorithm retains on average 14 times more edges than the baseline in full anonymization, and overall ensures a better trade-off between anonymity and data utility. In the budgeted variant, it achieves 4.8 times more anonymous nodes than the baseline. This work lays foundations for the future development of algorithms for anonymizing social networks.
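To make the k-anonymity condition in the abstract concrete, here is a minimal sketch that counts k-anonymous nodes on a toy graph. It assumes node degree as the anonymity measure, which is only one simple choice and not necessarily any of the four measures the paper compares. The helper names (`build_adjacency`, `degree_signature`, `k_anonymous_nodes`) and the toy edge list are hypothetical and do not come from the paper's framework.

```python
from collections import Counter


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_signature(adj):
    """Assumed anonymity measure: two nodes are equivalent if they have the same degree."""
    return {node: len(neighbours) for node, neighbours in adj.items()}


def k_anonymous_nodes(adj, k, signature=degree_signature):
    """Return the nodes whose equivalence class holds at least k nodes,
    i.e. nodes with at least k-1 equivalent peers."""
    sig = signature(adj)
    class_sizes = Counter(sig.values())
    return {node for node, s in sig.items() if class_sizes[s] >= k}


# Toy graph (illustrative only): a path 0-1-2-3-4 with an extra leaf 5 attached to node 2.
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (2, 5)]
toy_adj = build_adjacency(toy_edges)

anonymous_k2 = k_anonymous_nodes(toy_adj, k=2)
anonymous_k3 = k_anonymous_nodes(toy_adj, k=3)
print(sorted(anonymous_k2))  # node 2 is the only degree-3 node, so it is not 2-anonymous
print(sorted(anonymous_k3))
```

Swapping in a stricter `signature` function (for example, one that also looks at the degrees of neighbours) would shrink the equivalence classes, which illustrates why the choice of measure changes both the initial anonymity and how hard the problem is.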

Problem

Research questions and friction points this paper is trying to address.

Maximize anonymous nodes in social networks via graph alteration

Compare effectiveness of edge deletion versus other anonymization methods

Evaluate impact of anonymity measures on problem difficulty and results

Innovation

Methods, ideas, or system contributions that make the work stand out.

Proposes four heuristic algorithms for anonymization

Identifies edge deletion as the most effective graph alteration operation

Focuses on maximizing the number of k-anonymous nodes
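The budgeted variant mentioned above can be illustrated with a naive greedy sketch: delete at most `budget` edges, each time picking the edge whose removal most increases the number of k-anonymous nodes. This is not one of the paper's four heuristics and not its edge-sampling baseline. It again assumes degree as the anonymity measure, and the function names and toy graph are hypothetical.

```python
from collections import Counter


def count_k_anonymous(edges, nodes, k):
    """Count nodes that share their degree with at least k-1 other nodes
    (degree is an assumed anonymity measure for this sketch)."""
    degree = {node: 0 for node in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    class_sizes = Counter(degree.values())
    return sum(1 for node in nodes if class_sizes[degree[node]] >= k)


def greedy_budgeted_deletion(edges, nodes, k, budget):
    """Delete up to `budget` edges, each time choosing the single edge whose
    removal gives the largest strict gain in k-anonymous nodes."""
    remaining = list(edges)
    removed = []
    for _ in range(budget):
        current = count_k_anonymous(remaining, nodes, k)
        best_gain, best_edge = 0, None
        for edge in remaining:
            trial = [e for e in remaining if e != edge]
            gain = count_k_anonymous(trial, nodes, k) - current
            if gain > best_gain:
                best_gain, best_edge = gain, edge
        if best_edge is None:  # no deletion improves anonymity; stop early
            break
        remaining.remove(best_edge)
        removed.append(best_edge)
    return remaining, removed


# Toy graph (illustrative only): a path 0-1-2-3-4 with an extra leaf 5 attached to node 2.
toy_nodes = [0, 1, 2, 3, 4, 5]
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (2, 5)]

kept, removed = greedy_budgeted_deletion(toy_edges, toy_nodes, k=2, budget=2)
print("removed:", removed)
print("k-anonymous nodes after deletion:", count_k_anonymous(kept, toy_nodes, 2))
```

The sketch re-evaluates every candidate edge at every step, which is only practical on small graphs. It is meant to show the shape of the budgeted objective (anonymity gained per deleted edge), not an efficient or recommended algorithm.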

🔎 Similar Papers

A systematic comparison of measures for k-anonymity in networks

2024-07-02 · arXiv.org · Citations: 1
