MGDIL: Multi-Granularity Summarization and Domain-Invariant Learning for Cross-Domain Social Bot Detection

📅 2026-03-29
🤖 AI Summary
Existing social bot detection methods suffer from limited generalization under modality gaps, incomplete inputs, or out-of-distribution samples, struggling to handle cross-domain scenarios and novel camouflage strategies. To address this, this work proposes a large language model–based multi-granularity semantic summarization framework that unifies heterogeneous signals into textual representations. The approach integrates task-oriented instruction tuning, domain-adversarial learning, and cross-domain contrastive learning to jointly optimize domain-invariant yet discriminative feature representations. By uniquely combining multi-granularity summarization with domain-invariant learning, the method significantly improves detection performance across multiple cross-dataset and cross-temporal evaluations, effectively enhancing distribution alignment, intra-class compactness, and inter-class separability.
📝 Abstract
Social bots increasingly infiltrate online platforms through sophisticated disguises, threatening healthy information ecosystems. Existing detection methods often rely on modality-specific cues or local contextual features, making them brittle when modalities are missing or inputs are incomplete. Moreover, most approaches assume similar train-test distributions, which limits their robustness to out-of-distribution (OOD) samples and emerging bot types. To address these challenges, we propose Multi-Granularity Summarization and Domain-Invariant Learning (MGDIL), a unified framework for robust social bot detection under domain shift. MGDIL first transforms heterogeneous signals into unified textual representations through LLM-based multi-granularity summarization. Building on these representations, we design a collaborative optimization framework that integrates task-oriented LLM instruction tuning with domain-invariant representation learning. Specifically, task-oriented instruction tuning enhances the LLM's ability to capture subtle semantic cues and implicit camouflage patterns, while domain-adversarial learning and cross-domain contrastive learning are jointly employed to mitigate distribution shifts across datasets and time periods. Through this joint optimization, MGDIL learns stable and discriminative domain-invariant features, improving cross-domain social bot detection through better distribution alignment, stronger intra-class compactness, and clearer inter-class separation.
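The abstract names cross-domain contrastive learning but does not give its formulation. As a rough illustration only, the sketch below assumes a supervised InfoNCE-style loss in which the positives for an account are same-class accounts from a different domain. The function name, the temperature value, and the toy 2-D features are all hypothetical and not taken from the paper. The domain-adversarial term is left out because it needs an autograd framework.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cross_domain_contrastive_loss(feats, labels, domains, temperature=0.1):
    """Assumed formulation (not from the paper): supervised InfoNCE where
    positives share the anchor's class but come from a different domain."""
    n = len(feats)
    losses = []
    for i in range(n):
        # Temperature-scaled similarity of anchor i to every other sample.
        logits = {j: cosine(feats[i], feats[j]) / temperature
                  for j in range(n) if j != i}
        positives = [j for j in logits
                     if labels[j] == labels[i] and domains[j] != domains[i]]
        if not positives:
            continue  # anchor has no cross-domain positive; skip it
        log_denom = math.log(sum(math.exp(s) for s in logits.values()))
        # Average negative log-probability of picking a positive.
        losses.append(-sum(logits[j] - log_denom for j in positives)
                      / len(positives))
    return sum(losses) / len(losses) if losses else 0.0

# Toy features: two accounts per domain, label 1 = bot, 0 = human.
feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
domains = ["A", "B", "A", "B"]
aligned = cross_domain_contrastive_loss(feats, [1, 1, 0, 0], domains)
misaligned = cross_domain_contrastive_loss(feats, [1, 0, 0, 1], domains)
```

In this toy setup the loss is small when same-class accounts from different domains sit close together (`aligned`) and large when they do not (`misaligned`), which is the intra-class compactness and inter-class separation the abstract describes.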
Problem

Research questions and friction points this paper is trying to address.

social bot detection
cross-domain
out-of-distribution
domain shift
robustness
Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-granularity summarization
domain-invariant learning
LLM instruction tuning
cross-domain contrastive learning
social bot detection
Boyu Qiao
PhD, Information Engineering Institute, Chinese Academy of Sciences
Social bot detection, Natural Language Processing
Yunman Chen
Institute of Information Engineering, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences
Kun Li
Institute of Information Engineering, Chinese Academy of Sciences, China
Wei Zhou
Institute of Information Engineering, Chinese Academy of Sciences
Songlin Hu
Institute of Information Engineering, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences
Yunya Song
Hong Kong University of Science and Technology