Categorical Robustness Assessment for Machine Learning based Network Intrusion Detection Systems

📅 2026-06-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the lack of systematic evaluation of model robustness in existing network intrusion detection systems under adversarial attacks. Within a unified framework and using the ACI-IoT-2023 dataset, the authors conduct a cross-architecture and cross-attack robustness comparison of 1D CNN, LSTM, and Random Forest models against FGSM and PGD attacks with perturbation budgets ε ∈ [0.01, 0.1]. The findings reveal that high baseline accuracy does not guarantee strong adversarial robustness: although Random Forest achieves 99.98% accuracy under benign conditions, its accuracy drops by 73 percentage points at the smallest tested budget (ε = 0.01). In contrast, the 1D CNN is the most robust of the three, maintaining 95.5% accuracy at ε = 0.01 and degrading gradually as ε increases, with the LSTM falling in between. This work is the first to systematically uncover disparities in adversarial vulnerability among mainstream models in normalized feature spaces, offering critical insights for real-world deployment.
📝 Abstract
Network Intrusion Detection Systems (NIDS) rely heavily on Machine Learning (ML), but ML models can be manipulated via adversarial attacks. These attacks add carefully crafted perturbations to network traffic data that lead to misclassifications. While prior work has demonstrated adversarial vulnerabilities in isolated settings, systematic comparisons across architectures, attack classes, and attack categories under controlled attack conditions remain limited, leaving practitioners without clear guidance on which models to deploy in adversarial environments. This paper asks a simple question: which classifier architectures actually hold up when attackers try to manipulate the system? We put three popular architectures through their paces: a 1D Convolutional Neural Network (CNN), a Long Short-Term Memory (LSTM) network, and a Random Forest (RF) ensemble. Using the ACI-IoT-2023 dataset (over 1.2 million samples spanning 12 attack types), we subject each model to FGSM and PGD adversarial attacks, which apply gradient-based perturbations in normalized feature space consistent with established adversarial ML evaluation protocols, at perturbation budgets ranging from $ε=0.01$ to $ε=0.1$. Surprisingly, Random Forest achieved near-perfect baseline accuracy (99.98\%), yet collapsed catastrophically under attack, dropping 73 percentage points at the smallest perturbation we tested. CNN, on the other hand, retained 95.5\% accuracy at $ε=0.01$ and degraded gracefully as perturbations increased. LSTM fell somewhere in between. These findings flip the conventional wisdom: high baseline accuracy means nothing if a model shatters at the first sign of adversarial pressure. For practitioners deploying intrusion detection in adversarial environments, we recommend CNN-based architectures and provide scenario-specific deployment guidance.
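To make the attack setup in the abstract concrete, the following is a minimal sketch of FGSM and PGD in a normalized feature space. It is not the authors' implementation. It assumes a toy logistic-regression classifier in place of the paper's CNN, LSTM, and RF models, an L-infinity budget ε, and features min-max scaled to [0, 1]. The weights `w`, bias `b`, sample `x`, step size `alpha`, and step count `steps` are all hypothetical values chosen for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def logit(x, w, b):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(x, w, b):
    return 1 if sigmoid(logit(x, w, b)) >= 0.5 else 0

def input_gradient(x, y, w, b):
    # Gradient of binary cross-entropy w.r.t. the input features.
    # For a logistic model this is (p - y) * w.
    p = sigmoid(logit(x, w, b))
    return [(p - y) * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def clip(v, lo, hi):
    return max(lo, min(hi, v))

def fgsm(x, y, w, b, eps):
    # One step of size eps along the sign of the loss gradient,
    # then clip back into the assumed normalized range [0, 1].
    g = input_gradient(x, y, w, b)
    return [clip(xi + eps * sign(gi), 0.0, 1.0) for xi, gi in zip(x, g)]

def pgd(x, y, w, b, eps, steps=10, alpha=None):
    # Repeated small signed-gradient steps. After each step, project back
    # into the L-infinity ball of radius eps around the original x,
    # then into the valid feature range.
    alpha = eps / 4 if alpha is None else alpha  # hypothetical step size
    x_adv = list(x)
    for _ in range(steps):
        g = input_gradient(x_adv, y, w, b)
        stepped = [xa + alpha * sign(gi) for xa, gi in zip(x_adv, g)]
        x_adv = [clip(clip(s, xi - eps, xi + eps), 0.0, 1.0)
                 for s, xi in zip(stepped, x)]
    return x_adv

# Hypothetical toy model and one correctly classified sample (label 1).
w, b = [4.0, -3.0, 2.0], -1.0
x, y = [0.50, 0.40, 0.30], 1

for eps in (0.01, 0.05, 0.1):
    print(eps, predict(fgsm(x, y, w, b, eps), w, b), predict(pgd(x, y, w, b, eps), w, b))
```

For a linear model like this one the gradient sign never changes, so PGD ends at the same point as FGSM. The two attacks differ only for nonlinear models such as the CNN and LSTM. A Random Forest has no input gradient, so this sketch does not show how gradient-based perturbations would be produced for it.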
Problem

Research questions and friction points this paper is trying to address.

Network Intrusion Detection
Adversarial Attacks
Machine Learning Robustness
Classifier Architectures
Categorical Robustness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adversarial Robustness
Network Intrusion Detection
Model Architecture Comparison
Gradient-Based Attacks
Categorical Evaluation