🤖 AI Summary
Magecart malicious scripts pose a severe threat to client-side security and user trust in online payment systems. To address this, we propose a detection method that jointly ensures strong adversarial robustness and high interpretability. Our approach models the dynamic execution structure of malicious scripts using a Behavioral Deterministic Finite Automaton (DFA), enabling semantics-aware feature representation. We further design a multi-model framework integrating tree-based, linear, and kernel methods, guided by DFA-derived features and enhanced via adversarial training with an adaptive perturbation evaluation mechanism. Experiments on real-world web traffic achieve 98.2% detection accuracy, significantly outperforming baseline methods. Moreover, DFA path backtracking provides concise, verifiable decision explanations, and the method maintains over 95% accuracy under FGSM and PGD attacks. This work thus advances the state of the art in balancing security, transparency, and practical deployability for Magecart detection.
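To make the FGSM robustness claim concrete, here is a minimal sketch of how an FGSM evasion perturbation is computed; the logistic model, feature weights, and epsilon below are hypothetical illustrations, not the paper's trained detector or its actual feature space.

```python
import numpy as np

# FGSM sketch against a hand-set logistic model (weights are illustrative,
# not the paper's detector). For an input x with true label y, FGSM shifts
# x by eps in the direction of the sign of the loss gradient w.r.t. x.

w = np.array([2.0, -1.0, 0.5])   # hypothetical feature weights
b = -0.25                         # hypothetical bias

def predict_proba(x):
    """Probability that x is malicious under the sketch model."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

def fgsm(x, y, eps):
    # For sigmoid + cross-entropy, d(loss)/dx = (sigmoid(w.x + b) - y) * w
    grad = (predict_proba(x) - y) * w
    return x + eps * np.sign(grad)  # maximize loss -> push toward misclassification

x = np.array([1.0, 0.2, 0.4])     # a sample labeled malicious (y = 1)
x_adv = fgsm(x, y=1.0, eps=0.3)   # evasion attempt against the detector
```

Adversarial training then augments the training set with such perturbed samples (and PGD applies the same step iteratively with projection), which is what the reported over-95% accuracy under attack is measured against.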
📝 Abstract
Magecart skimming attacks have emerged as a significant threat to client-side security and user trust in online payment systems. This paper addresses the challenge of achieving robust and explainable detection of Magecart attacks through a comparative study of Machine Learning (ML) models on a real-world dataset. Tree-based, linear, and kernel-based models were applied, further enhanced through hyperparameter tuning and feature selection, to distinguish between benign and malicious scripts. These models are supported by a Behavioral Deterministic Finite Automaton (DFA) that captures structural behavior patterns in scripts, helping to analyze and classify client-side script execution logs. To ensure robustness against adversarial evasion attacks, the ML models were adversarially trained and evaluated using attacks from the Adversarial Robustness Toolbox and the Adaptative Perturbation Pattern Method. In addition, concise explanations of ML model decisions are provided, supporting transparency and user trust. Experimental validation demonstrated high detection performance and interpretable reasoning, showing that traditional ML models can be effective in real-world web security contexts.
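The behavioral-DFA idea can be sketched as follows; the states, event alphabet, and transitions below are hypothetical illustrations of the skimming pattern (harvest → stage → exfiltrate), not the paper's actual automaton. The key property is that the visited state path doubles as a concise, verifiable explanation of the decision.

```python
# Minimal behavioral-DFA sketch (hypothetical states/alphabet, not the
# paper's automaton). Each transition consumes one logged client-side
# event; reaching an accepting state flags the script, and the traversed
# path serves as the decision explanation.

TRANSITIONS = {
    ("start", "read_payment_form"): "harvest",
    ("harvest", "serialize_data"): "staged",
    ("staged", "send_third_party"): "exfiltrate",
}
ACCEPTING = {"exfiltrate"}

def classify(events):
    """Run the DFA over an execution log; return (is_malicious, visited path)."""
    state, path = "start", ["start"]
    for ev in events:
        state = TRANSITIONS.get((state, ev), state)  # irrelevant events keep the state
        path.append(state)
    return state in ACCEPTING, path

malicious_log = ["read_payment_form", "serialize_data", "send_third_party"]
benign_log = ["read_payment_form", "serialize_data", "send_first_party"]
```

In the paper's pipeline, such DFA-derived structural features feed the ML models rather than acting as the sole classifier, and path backtracking supplies the human-readable justification.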