On the efficacy of old features for the detection of new bots

📅 2025-06-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the challenge of detecting evolving social media bots by systematically evaluating the generalizability of four mainstream feature categories—Botometer scores, user profile attributes, timeline behavioral patterns, and client metadata—across multiple recent benchmark datasets. Through comparative experiments using both supervised and unsupervised classifiers, we demonstrate that lightweight, API-accessible features—such as basic demographic signals, posting temporal patterns, and device identifiers—achieve detection performance comparable to, or even surpassing, that of complex, resource-intensive features (e.g., deep behavioral sequences or graph-structured representations). Our key contribution is the empirical validation that simple, low-cost, and easily extractable account-level features retain strong discriminative power against novel malicious bots. This finding challenges the prevailing assumption that high-fidelity bot detection necessitates sophisticated behavioral or network-based features, thereby providing both theoretical grounding and practical guidance for developing efficient, scalable, and accessible generic bot detection systems.

📝 Abstract
For more than a decade now, academics and online platform administrators have been studying solutions to the problem of bot detection. Bots are computer algorithms whose use is far from benign: malicious bots are purposely created to distribute spam, promote public figures and, ultimately, bias public opinion. To fight the bot invasion in our online ecosystems, several approaches have been implemented, mostly based on (supervised and unsupervised) classifiers, which adopt the most varied account features, from the simplest to the most expensive to extract from the raw data obtainable through the Twitter public APIs. In this exploratory study, using Twitter as a benchmark, we compare the performance of four state-of-the-art feature sets in detecting novel bots: one of the output scores of the popular bot detector Botometer, which considers more than 1,000 features of an account to reach a decision; two feature sets based on the account profile and timeline; and information about the Twitter client from which the user tweets. The results of our analysis, conducted on six recently released datasets of Twitter accounts, hint at the possible use of general-purpose classifiers and cheap-to-compute account features for the detection of evolved bots.
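The kind of comparison the abstract describes can be illustrated with a minimal sketch: a simple classifier trained only on cheap, API-accessible profile and timeline features. Everything below is a hypothetical illustration — the data is synthetic, the feature names are plausible stand-ins for Twitter profile fields, and the nearest-centroid rule is just one example of a lightweight general-purpose classifier; none of it reproduces the paper's actual benchmarks or models.

```python
import math
import random

random.seed(42)

# Hypothetical cheap-to-compute account features, loosely inspired by the
# profile/timeline feature families the paper compares. The value ranges
# are invented for illustration only.
FEATURES = ["followers_friends_ratio", "account_age_days",
            "tweets_per_day", "has_default_profile"]

def make_account(is_bot):
    """Generate one synthetic account as a feature vector."""
    if is_bot:
        return [random.uniform(0.0, 0.3),          # few followers per friend
                random.uniform(1, 200),            # young account
                random.uniform(50, 300),           # very high posting rate
                1.0 if random.random() < 0.8 else 0.0]  # default profile
    return [random.uniform(0.5, 5.0),
            random.uniform(300, 4000),
            random.uniform(0.1, 20),
            1.0 if random.random() < 0.2 else 0.0]

train = [(make_account(b), b) for b in [True, False] * 200]
test = [(make_account(b), b) for b in [True, False] * 50]

# Min-max scale each feature so all dimensions contribute comparably.
mins = [min(v[i] for v, _ in train) for i in range(len(FEATURES))]
maxs = [max(v[i] for v, _ in train) for i in range(len(FEATURES))]

def scale(v):
    return [(v[i] - mins[i]) / ((maxs[i] - mins[i]) or 1.0)
            for i in range(len(FEATURES))]

def centroid(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(FEATURES))]

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

bot_centroid = centroid([scale(v) for v, b in train if b])
human_centroid = centroid([scale(v) for v, b in train if not b])

def predict(v):
    """Label an account 'bot' if it lies closer to the bot centroid."""
    s = scale(v)
    return distance(s, bot_centroid) < distance(s, human_centroid)

accuracy = sum(predict(v) == b for v, b in test) / len(test)
print(f"nearest-centroid accuracy on cheap features: {accuracy:.2f}")
```

On synthetic data with clearly separated populations such a trivial model scores highly, which is exactly the point the study makes empirically on real benchmarks: when cheap account-level features remain discriminative, expensive behavioral or graph features are not strictly necessary.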
Problem

Research questions and friction points this paper is trying to address.

Evaluating old features for detecting new Twitter bots
Comparing performance of four bot detection feature sets
Exploring cost-effective methods to identify evolved bots
Innovation

Methods, ideas, or system contributions that make the work stand out.

Compares four state-of-the-art bot detection feature sets
Uses general-purpose classifiers for evolved bots
Employs cheap-to-compute account features effectively