A Comprehensive Analysis of Adversarial Attacks against Spam Filters

📅 2025-05-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the insufficient robustness of deep learning–based spam filters against adversarial attacks. We systematically evaluate six mainstream models—including LSTM and BERT—under word-level, character-level, sentence-level, and AI-generated paragraph-level attacks. To enhance attack efficacy, we propose a dual-weight scoring mechanism integrating domain-informed spam weights and attention weights, enabling targeted multi-granularity adversarial example generation. Our attack framework unifies gradient-based, substitution-based, and generative strategies to induce cross-granularity perturbations. Experimental results show that all evaluated attacks achieve an average success rate exceeding 78% across models; notably, we identify, for the first time, critical failure modes under semantically invariant perturbations. The study delivers a reproducible evaluation framework and empirical evidence to guide robustness enhancement in spam detection systems.
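The dual-weight scoring idea described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the linear blend with `alpha`, and the toy spam/attention values are all assumptions made for demonstration. It shows how a domain-informed spam weight per token could be combined with a model's attention weight per position to rank the most attack-worthy tokens.

```python
# Hedged sketch of a dual-weight token scoring mechanism. Assumes
# "spam weights" come from some domain-informed source (e.g. log-odds
# of a bag-of-words spam classifier) and attention weights come from
# the target model. The linear blend below is an illustrative choice.

def dual_weight_scores(tokens, spam_weight, attention_weight, alpha=0.5):
    """Score each token by blending spam salience and model attention."""
    scores = []
    for i, tok in enumerate(tokens):
        s = spam_weight.get(tok, 0.0)   # domain-informed spam salience
        a = attention_weight[i]         # attention mass on this position
        scores.append(alpha * s + (1 - alpha) * a)
    return scores

def top_k_targets(tokens, scores, k=3):
    """Pick the k highest-scoring positions to perturb first."""
    order = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    return [(i, tokens[i]) for i in order[:k]]

# Toy example: spam-salient words like "free" and "prize" rank first.
tokens = ["claim", "your", "free", "prize", "now"]
spam_w = {"free": 2.0, "prize": 1.5, "now": 0.5}
attn_w = [0.1, 0.05, 0.3, 0.35, 0.2]

scores = dual_weight_scores(tokens, spam_w, attn_w)
print(top_k_targets(tokens, scores))  # → [(2, 'free'), (3, 'prize'), (4, 'now')]
```

In a real attack, the top-ranked positions would then be fed to a word-, character-, or sentence-level perturbation strategy, which is where the paper's unified gradient-based, substitution-based, and generative framework would take over.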

📝 Abstract
Deep learning has revolutionized email filtering, which is critical to protecting users from cyber threats such as spam, malware, and phishing. However, the increasing sophistication of adversarial attacks poses a significant challenge to the effectiveness of these filters. This study investigates the impact of adversarial attacks on deep learning-based spam detection systems using real-world datasets. Six prominent deep learning models are evaluated on these datasets, analyzing attacks at the word, character, sentence, and AI-generated paragraph levels. Novel scoring functions, including spam weights and attention weights, are introduced to improve attack effectiveness. This comprehensive analysis sheds light on the vulnerabilities of spam filters and contributes to efforts to improve their security against evolving adversarial threats.
Problem

Research questions and friction points this paper is trying to address.

Analyzes adversarial attacks on deep learning spam filters
Evaluates six models against word-, character-, sentence-, and AI-generated paragraph-level attacks
Introduces novel scoring functions to enhance attack effectiveness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Evaluates six deep learning models on real-world datasets
Introduces a dual-weight scoring mechanism combining spam weights and attention weights
Analyzes attacks at multiple textual levels