🤖 AI Summary
This paper addresses the challenge of identifying and localizing hidden nodes—specifically Sybil users, peripheral users, and critical influencers—in unknown social networks. We formally define, for the first time, a systematic three-category “hidden friend” discovery task. Methodologically, we propose a graph exploration framework based on progressive subgraph sampling, integrating both embedding-based and non-embedding models. To enable adaptive model selection and robust prediction under partial observability, we design a feedback-driven multi-armed bandit (MAB) online coordination strategy. Experiments on real-world social graphs demonstrate that our approach incurs only 1.2× and 1.4× the query cost required for discovering 10% and 90% of hidden nodes, respectively, compared to full-topology knowledge. The MAB strategy significantly outperforms any single-model baseline and exhibits strong cross-scenario generalization capability.
📝 Abstract
In this paper, we address the challenge of discovering hidden nodes in unknown social networks, formulating three types of hidden-node discovery problems, namely, Sybil-node discovery, peripheral-node discovery, and influencer discovery. We tackle these problems by employing a graph exploration framework grounded in machine learning. Leveraging the structure of the subgraph gradually obtained from graph exploration, we construct prediction models to identify target hidden nodes in unknown social graphs. Through empirical investigations of real social graphs, we investigate the efficiency of graph exploration strategies in uncovering hidden nodes. Our results show that our graph exploration strategies discover hidden nodes with an efficiency comparable to that when the graph structure is known. Specifically, the query cost of discovering 10% of the hidden nodes is at most only 1.2 times that when the topology is known, and the query-cost multiplier for discovering 90% of the hidden nodes is at most only 1.4. Furthermore, our results suggest that using node embeddings, which are low-dimensional vector representations of nodes, for hidden-node discovery is a double-edged sword: it is effective in certain scenarios but sometimes degrades the efficiency of node discovery. Guided by this observation, we examine the effectiveness of using a bandit algorithm to combine the prediction models that use node embeddings with those that do not, and our analysis shows that the bandit-based graph exploration strategy achieves efficient node discovery across a wide array of settings.