AI Summary
This paper addresses two critical challenges in Industry 4.0: the difficulty of detecting zero-day attacks, and extreme class imbalance in network traffic data (minority-class prevalence as low as 0.000004%). It proposes a lightweight intrusion detection method that integrates an enhanced SMOTE-ENN data augmentation technique with a deep neural network (DNN). Specifically, the Edited Nearest Neighbor (ENN) rule is embedded into the SMOTE oversampling pipeline to improve the quality and validity of synthetic minority samples. Additionally, a DNN classifier is designed to accommodate highly skewed distributions and mitigate overfitting. Evaluated on the original imbalanced test set, the method achieves substantial gains in minority-class recall and F1-score while maintaining superior overall accuracy compared to state-of-the-art baselines. It also demonstrates enhanced generalization capability. This end-to-end solution offers both efficiency and robustness for zero-day attack detection in industrial cyber-physical systems.
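To make the SMOTE and ENN steps concrete, the following is a minimal standard-library sketch of textbook SMOTE followed by a single ENN cleaning pass. It is not the paper's enhanced variant: the function names `smote` and `enn`, the toy 2-D points, and the choice of `k=3` are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Textbook SMOTE: interpolate between a minority point and one of
    its k nearest minority neighbours (hypothetical helper)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


def enn(points, labels, k=3):
    """Edited Nearest Neighbour: drop every sample whose label disagrees
    with the majority label of its k nearest neighbours (single pass)."""
    keep = []
    for i, x in enumerate(points):
        nearest = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(x, points[j]),
        )[:k]
        votes = [labels[j] for j in nearest]
        if max(set(votes), key=votes.count) == labels[i]:
            keep.append(i)
    return [points[i] for i in keep], [labels[i] for i in keep]


# Toy data (assumed): a 20-point majority grid, a small minority cluster,
# and one minority point sitting inside the majority region.
majority = [(0.5 * i, 0.5 * j) for i in range(5) for j in range(4)]
minority = [(5.0, 5.0), (5.5, 5.2), (5.2, 5.6), (0.75, 0.75)]

synthetic = smote(minority, n_new=len(majority) - len(minority))
X = majority + minority + synthetic
y = [0] * len(majority) + [1] * (len(minority) + len(synthetic))

X_clean, y_clean = enn(X, y)
print("before ENN:", len(X), "after ENN:", len(X_clean))
```

In practice a maintained implementation would normally be preferred over hand-written code like this; the sketch only shows where oversampling ends and cleaning begins.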
Abstract
Cyberattack detection in Critical Infrastructure and Supply Chains has become challenging in Industry 4.0. Intrusion Detection Systems (IDS) are deployed to counter cyberattacks. However, an IDS detects attacks effectively only when they match known signatures and patterns, so zero-day attacks go undetected. To overcome this drawback, the integration of a Dense Neural Network (DNN) with data augmentation is proposed. It makes the IDS intelligent and enables it to self-learn with high accuracy when a novel attack is encountered. Datasets captured from network flows are highly imbalanced, as is traffic on the real network itself, so data augmentation plays a crucial role in balancing the data. Balancing is challenging because the minority class is as low as 0.000004% of the dataset, while the most abundant class makes up more than 80% of it. The Synthetic Minority Oversampling Technique (SMOTE) is used to balance the data. However, although higher accuracies are achieved with balanced test data, noticeably lower accuracies are obtained with the original imbalanced test data, which suggests overfitting. A comparison with state-of-the-art research using SMOTE with Edited Nearest Neighbor (SMOTE-ENN) shows that classification remains poor on the original dataset. This suggests that highly imbalanced network flow datasets require a different method of data augmentation.
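The gap between overall accuracy and minority-class performance on an imbalanced test set can be illustrated with a small calculation. The sketch below uses invented labels and counts (995 benign flows, 5 attack flows, and a hypothetical set of predictions); none of these numbers come from the paper.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall for each class: correctly predicted samples / true samples."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical imbalanced test set: 995 benign flows and 5 attack flows.
y_true = ["benign"] * 995 + ["attack"] * 5

# Hypothetical predictions: 10 benign flows are misflagged as attacks,
# and only 1 of the 5 real attacks is caught.
y_pred = ["benign"] * 985 + ["attack"] * 10 + ["attack"] * 1 + ["benign"] * 4

print("accuracy:", accuracy(y_true, y_pred))
print("recall per class:", per_class_recall(y_true, y_pred))
```

Under these assumed counts, overall accuracy stays close to 99% while attack recall is only 20%, which is why per-class recall and F1-score on the original imbalanced test data are more informative than accuracy alone.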