Boosting Bot Detection via Heterophily-Aware Representation Learning and Prototype-Guided Cluster Discovery

📅 2025-06-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Social bot detection suffers from heavy label dependency and poor cross-community generalization, and existing generative graph self-supervised methods rely too heavily on the homophily assumption and fail to capture global heterophilous patterns. Method: We propose BotHP, the first framework integrating heterophily-aware representation learning with prototype-guided cluster discovery to move beyond homophily constraints and model bot clusters that are semantically coherent yet spread across the graph. It employs a dual-encoder architecture, heterogeneity-enhanced contrastive learning, and generative graph self-supervised pretraining for robust weakly supervised detection. Contributions/Results: Evaluated on two real-world datasets, BotHP consistently boosts the performance of diverse graph-based detectors, reduces labeling requirements, enhances cross-community generalization, and demonstrates resilience against interaction-based adversarial camouflage attacks.

📝 Abstract

Detecting social media bots is essential for maintaining the security and trustworthiness of social networks. While contemporary graph-based detection methods demonstrate promising results, their practical application is limited by label reliance and poor generalization capability across diverse communities. Generative Graph Self-Supervised Learning (GSL) presents a promising paradigm to overcome these limitations, yet existing approaches predominantly follow the homophily assumption and fail to capture the global patterns in the graph, which potentially diminishes their effectiveness when facing the challenges of interaction camouflage and distributed deployment in bot detection scenarios. To this end, we propose BotHP, a generative GSL framework tailored to boost graph-based bot detectors through heterophily-aware representation learning and prototype-guided cluster discovery. Specifically, BotHP leverages a dual-encoder architecture, consisting of a graph-aware encoder to capture node commonality and a graph-agnostic encoder to preserve node uniqueness. This enables the simultaneous modeling of both homophily and heterophily, effectively countering the interaction camouflage issue. Additionally, BotHP incorporates a prototype-guided cluster discovery pretext task to model the latent global consistency of bot clusters and identify spatially dispersed yet semantically aligned bot collectives. Extensive experiments on two real-world bot detection benchmarks demonstrate that BotHP consistently boosts graph-based bot detectors, improving detection performance, alleviating label reliance, and enhancing generalization capability.
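
The dual-encoder idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not specify the encoders, so mean neighbor aggregation stands in for the graph-aware encoder, an identity-style per-node transform stands in for the graph-agnostic encoder, and the two outputs are simply concatenated. The function names and toy data are hypothetical and are not taken from BotHP.

```python
def graph_aware_encode(features, neighbors):
    """Average each node's features with its neighbors' features (uses edges)."""
    out = {}
    for node, feat in features.items():
        group = [feat] + [features[n] for n in neighbors.get(node, [])]
        out[node] = [sum(col) / len(group) for col in zip(*group)]
    return out


def graph_agnostic_encode(features, scale=1.0):
    """Transform each node independently, ignoring the edges entirely."""
    return {node: [scale * x for x in feat] for node, feat in features.items()}


def dual_encode(features, neighbors):
    """Concatenate both views (assumed combination rule, not the paper's)."""
    aware = graph_aware_encode(features, neighbors)
    agnostic = graph_agnostic_encode(features)
    return {node: aware[node] + agnostic[node] for node in features}


# Toy graph: nodes 0 and 1 are humans, node 2 is a bot that links to both.
features = {0: [1.0, 0.0], 1: [1.0, 0.0], 2: [0.0, 1.0]}
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

embeddings = dual_encode(features, neighbors)
```

In this toy graph the neighbor-averaged half of the embedding is identical for all three nodes, so the bot is indistinguishable from the humans in that view. The graph-agnostic half still carries the bot's own features, which is the intuition behind preserving node uniqueness alongside node commonality.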

Problem

Research questions and friction points this paper is trying to address.

Detecting social media bots to enhance network security

Overcoming label reliance and poor generalization in bot detection

Addressing interaction camouflage and distributed bot deployment challenges
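
To make the interaction camouflage problem concrete, the sketch below computes the edge homophily ratio, a common measure defined as the fraction of edges whose endpoints share a label. The toy graphs and labels are hypothetical and chosen only to show how bots that link to humans lower the ratio; they are not drawn from the paper's datasets.

```python
def edge_homophily(edges, labels):
    """Return the fraction of edges whose two endpoints share a label."""
    if not edges:
        raise ValueError("graph has no edges")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Toy graph: nodes 0-2 are humans, nodes 3-4 are bots.
labels = {0: "human", 1: "human", 2: "human", 3: "bot", 4: "bot"}

# Bots that only interact with each other: every edge is same-label.
clustered = [(0, 1), (1, 2), (3, 4)]

# Camouflaged bots that follow humans instead: bot edges become cross-label.
camouflaged = [(0, 1), (1, 2), (3, 0), (4, 2)]

h_clustered = edge_homophily(clustered, labels)
h_camouflaged = edge_homophily(camouflaged, labels)
```

A detector that assumes neighbors share labels works well in the first graph and is misled in the second, where half of the edges connect a bot to a human.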

Innovation

Methods, ideas, or system contributions that make the work stand out.

Heterophily-aware representation learning for bot detection

Dual-encoder architecture modeling homophily and heterophily

Prototype-guided cluster discovery for global consistency
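
One way to picture prototype-guided grouping is nearest-prototype assignment in embedding space, sketched below. This is an assumption for illustration only: the page does not describe how BotHP learns or uses its prototypes, so fixed prototype vectors and cosine similarity are stand-ins, and all names and values are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def assign_to_prototypes(embeddings, prototypes):
    """Map each node to the index of its most similar prototype."""
    return {
        node: max(range(len(prototypes)), key=lambda k: cosine(emb, prototypes[k]))
        for node, emb in embeddings.items()
    }


# Toy embeddings: nodes 5 and 9 need not share any edge in the graph.
node_embeddings = {0: [0.9, 0.1], 1: [0.8, 0.2], 5: [0.1, 0.9], 9: [0.2, 0.8]}
prototypes = [[1.0, 0.0], [0.0, 1.0]]

assignment = assign_to_prototypes(node_embeddings, prototypes)
```

Because the assignment depends only on embeddings and not on adjacency, nodes that sit far apart in the graph can still land in the same cluster, which matches the idea of spatially dispersed yet semantically aligned bot collectives.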

🔎 Similar Papers

Detecting Financial Bots on the Ethereum Blockchain

2024-03-28 | The Web Conference | Citations: 1
