Evaluating and Combating the Impact of Concept Drift on the Performance of Machine Learning-Based Phishing Detection Systems

📅 2026-06-09

🤖 AI Summary

This study addresses the performance degradation that concept drift causes in machine learning–based phishing email detection systems. It evaluates how evolving phishing attacks affect detection models on real-world datasets and proposes targeted mitigation mechanisms. The pipeline combines email content feature engineering, classification models, and concept drift detection with adaptive retraining. According to the reported experiments, existing systems are vulnerable under a changing threat landscape, while the proposed approach maintains detection accuracy and improves robustness as phishing campaigns evolve.
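The summary pairs concept drift detection with adaptive retraining but does not say which detector the paper uses. As a rough illustration only, the sketch below implements the Drift Detection Method (DDM, Gama et al., 2004), a well-known error-rate monitor that is swapped in here as an assumption and is not confirmed to be the authors' method. The class name `DDM`, the helper `find_drift_points`, and all thresholds are hypothetical choices for this example.

```python
import math


class DDM:
    """Drift Detection Method over a stream of 0/1 classifier errors.

    Assumption: this detector is illustrative; the paper's own drift
    detector is not specified in the summary.
    """

    def __init__(self, min_samples=30, warn_level=2.0, drift_level=3.0):
        self.min_samples = min_samples
        self.warn_level = warn_level
        self.drift_level = drift_level
        self.reset()

    def reset(self):
        # Forget all statistics; called after a drift is signalled.
        self.n = 0
        self.errors = 0
        self.p_min = math.inf
        self.s_min = math.inf

    def update(self, error):
        """Feed 1 for a misclassified email, 0 otherwise.

        Returns "stable", "warning", or "drift".
        """
        self.n += 1
        self.errors += error
        p = self.errors / self.n                 # running error rate
        s = math.sqrt(p * (1.0 - p) / self.n)    # its standard deviation
        if self.n < self.min_samples:
            return "stable"
        # Remember the best (lowest) p + s seen so far.
        # Note: if no error has occurred yet, s_min is 0 and the first
        # error triggers a drift signal; this is a known DDM quirk.
        if p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = p, s
        if p + s > self.p_min + self.drift_level * self.s_min:
            self.reset()
            return "drift"
        if p + s > self.p_min + self.warn_level * self.s_min:
            return "warning"
        return "stable"


def find_drift_points(error_stream):
    """Return the stream indices at which retraining would be triggered."""
    detector = DDM()
    points = []
    for i, error in enumerate(error_stream):
        if detector.update(error) == "drift":
            # In a real pipeline, the classifier would be retrained here
            # on recent labelled emails (hypothetical step, not shown).
            points.append(i)
    return points


# Synthetic stream: about 5% errors for 500 emails, then about 50% errors,
# imitating a phishing campaign the model no longer recognises.
stream = [1 if i % 20 == 0 else 0 for i in range(500)]
stream += [1 if i % 2 == 0 else 0 for i in range(500)]
print(find_drift_points(stream))
```

On this synthetic stream the detector stays quiet during the low-error phase and signals drift shortly after index 500. In practice the error signal requires ground-truth labels, which for email typically arrive with a delay, so the retraining trigger would lag the actual change.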

📝 Abstract

The expansion of the digital domain has resulted in a substantial increase in digital communication, with email emerging as one of the most prominent channels. The proliferation of email communication is apparent in both professional and personal contexts, thereby creating numerous vulnerabilities for malicious actors to exploit. Spam emails, a form of unsolicited correspondence often bearing malicious intent towards recipients, have been an ongoing challenge for email users since the inception of email technology, and this problem has been exacerbated by the growth of the digital landscape. Email spam filters are integral components of email clients, engineered to identify potentially harmful messages and alert users to their malicious content. Phishing, frequently the initial phase of malware-based attacks, is evolving rapidly, with malware becoming increasingly sophisticated over time. A widely adopted approach for detecting malicious activity within malware and spam domains is the application of machine learning. Our aim is to assess the impact of the evolution within the spam email domain on these machine learning-based detection systems and to explore strategies for mitigating associated performance degradation.

Problem

Research questions and friction points this paper is trying to address.

concept drift

phishing detection

machine learning

email spam

performance degradation

Innovation

Methods, ideas, or system contributions that make the work stand out.

concept drift

phishing detection

machine learning